Applicant herewith submits to the United States Designated/Elected Office (DO/EO/US) the following items and other information:

1. ☒ This is a FIRST submission of items concerning a filing under 35 U.S.C. 371.

2. □ This is a SECOND or SUBSEQUENT submission of items concerning a filing under 35 U.S.C. 371.

3. ☒ This is an express request to promptly begin national examination procedures (35 U.S.C. 371(f)).

4. ☒ The US has been elected by the expiration of 19 months from the earliest claimed priority date (PCT Article 31).

5. ☒ A copy of the International Application as filed (35 U.S.C. 371(c)(2)).

a. □ is transmitted herewith (required only if not transmitted by the International Bureau).

b. ☒ has been transmitted by the International Bureau.

c. □ is not required, as the application was filed in the United States Receiving Office (RO/US). 

6. □ An English language translation of the International Application as filed (35 U.S.C. 371(c)(2)). 

a. □ is transmitted herewith (required only if not transmitted by the International Bureau). 

b. □ has been transmitted by the International Bureau. 

7. ☒ Amendments to the claims of the International Application under PCT Article 19 (35 U.S.C. 371(c)(3)).

a. □ are attached hereto (required only if not transmitted by the International Bureau). 

b. □ have been communicated by the International Bureau. 

c. □ have not been made; however, the time limit for making such amendments has NOT expired. 

d. ☒ have not been made and will not be made.

8. □ An English language translation of the amendments to the claims under PCT Article 19 (35 U.S.C. 371(c)(3)). 

9. □ An oath or declaration of the inventor(s) (35 U.S.C. 371(c)(4)). 

10. □ An English language translation of the annexes to the International Preliminary Examination Report under PCT Article 36 (35 U.S.C. 371(c)(5)).

Items 11 to 16 below concern document(s) or information included:

11. □ An Information Disclosure Statement under 37 CFR 1.97 and 1.98.

12. □ An assignment document for recording. A separate cover sheet in compliance with 37 CFR 3.28 and 3.31 is included.



13. ☒ A

14. □ A

15. □ A

16. □ A

17. ☒ A computer-readable form of the sequence listing in accordance with PCT Rule 13ter.2 and 35 U.S.C. 1.821-1.825.

18. □ A second copy of the published international application under 35 U.S.C. 154(d)(4). 

19. □ A second copy of the English language translation of the international application under 35 U.S.C. 154(d)(4). 

20. ☒ Other items or information:

Copy of PCT/RO/101 form
Copy of PCT Published Application with Search Report
PCT/IB/308 form
Copy of International Preliminary Examination Report
Statement Pursuant to 37 CFR 1.821(f) w/ diskette and paper copy of sequence listing

Express Mail Label No. EL819461955US
Date Mailed: July 27, 2001
U.S. APPLICATION NO. (if known, see 37 CFR 1.5)

INTERNATIONAL APPLICATION
PCT/GB00/00263

ATTORNEY'S DOCKET NUMBER
B0192/7031


21. ☒ The following fees are submitted:

BASIC NATIONAL FEE (37 CFR 1.492(a)(1)-(5)):

Neither international preliminary examination fee (37 CFR 1.482) nor international search fee (37 CFR 1.445(a)(2)) paid to USPTO and International Search Report not prepared by the EPO or JPO $1000.00

International preliminary examination fee (37 CFR 1.482) not paid to USPTO but International Search Report prepared by the EPO or JPO $860.00

International preliminary examination fee (37 CFR 1.482) not paid to USPTO but international search fee (37 CFR 1.445(a)(2)) paid to USPTO $710.00

International preliminary examination fee paid to USPTO (37 CFR 1.482) but all claims did not satisfy provisions of PCT Article 33(1)-(4) $690.00

International preliminary examination fee paid to USPTO (37 CFR 1.482) and all claims satisfied provisions of PCT Article 33(1)-(4) $100.00

ENTER APPROPRIATE BASIC FEE AMOUNT = $860.00


CALCULATIONS PTO USE ONLY

Surcharge of $130.00 for furnishing the oath or declaration later than □ 20 □ 30 months from the earliest claimed priority date (37 CFR 1.492(e)). $

CLAIMS                NUMBER FILED    NUMBER EXTRA    RATE
Total Claims          34 - 20 =       14              x $18.00     $252.00
Independent Claims    6 - 3 =         3               x $80.00     $240.00
MULTIPLE DEPENDENT CLAIM(S) (if applicable)           + $270.00    $

TOTAL OF ABOVE CALCULATIONS = $1352.00

Applicant claims small entity status. See 37 CFR 1.27. The fees indicated above are reduced by 1/2. $

SUBTOTAL = $1352.00

Processing fee of $130.00 for furnishing the English translation later than □ 20 □ 30 months from the earliest claimed priority date (37 CFR 1.492(f)). $

TOTAL NATIONAL FEE = $1352.00

Fee for recording the enclosed assignment (37 CFR 1.21(h)). The assignment must be accompanied by an appropriate cover sheet (37 CFR 3.28, 3.31). $40.00 per property + $

TOTAL FEES ENCLOSED = $1352.00

Amount to be refunded: $
Amount to be charged: $



a. ☒ A check in the amount of $1352.00 to cover the above fees is enclosed.

b. □ Please charge my Deposit Account No. in the amount of $ to cover the above fees. A duplicate copy of this sheet is enclosed.

c. ☒ The Commissioner is hereby authorized to charge any additional fees which may be required, or credit any overpayment to Deposit Account No. 23/2825. A duplicate of this sheet is enclosed.

d. □ Fees are to be charged to a credit card. WARNING: Information on this form may become public. Credit card information should not be included on this form. Provide credit card information and authorization on PTO-2038.

NOTE: Where an appropriate time limit under 37 CFR 1.494 or 1.495 has not been met, a petition to revive (37 CFR 1.137(a) or (b)) must be filed and granted to restore the application to pending status.
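The fee arithmetic on this sheet can be checked with a short sketch. The rates and claim counts below are copied from the form; the function name, parameter names, and constants are illustrative assumptions and do not correspond to any USPTO system or current fee schedule.

```python
from decimal import Decimal

# Amounts as printed on this sheet (2001 schedule); names are hypothetical.
BASIC_FEE = Decimal("860.00")                   # search report prepared by the EPO or JPO
RATE_PER_EXTRA_CLAIM = Decimal("18.00")         # each claim beyond 20
RATE_PER_EXTRA_INDEPENDENT = Decimal("80.00")   # each independent claim beyond 3
MULTIPLE_DEPENDENT_FEE = Decimal("270.00")      # flat fee, only if applicable


def national_fee(total_claims, independent_claims, has_multiple_dependent=False):
    """Return the total national fee for the given claim counts."""
    extra_total = max(total_claims - 20, 0)
    extra_independent = max(independent_claims - 3, 0)
    fee = BASIC_FEE
    fee += extra_total * RATE_PER_EXTRA_CLAIM
    fee += extra_independent * RATE_PER_EXTRA_INDEPENDENT
    if has_multiple_dependent:
        fee += MULTIPLE_DEPENDENT_FEE
    return fee


# 34 total claims and 6 independent claims, as entered on the form.
print(national_fee(34, 6))  # 1352.00
```

Under these assumptions the result matches the $1352.00 entered as TOTAL FEES ENCLOSED. Passing `has_multiple_dependent=True` adds the $270.00 multiple dependent claim fee.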



SEND ALL CORRESPONDENCE TO:

WOLF, GREENFIELD & SACKS, P.C.
600 Atlantic Avenue
Boston, Massachusetts 02210
Tel: (617) 720-3500

CUSTOMER NUMBER: 23628

NAME: John R. Van Amsterdam

REGISTRATION NO.: 40,212

Form PTO-1390 (REV 11-2000)
Attorney's Docket No. B0192/7031

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicants : Bramley et al.
U.S. Serial No. : 09/890,229
International Application No. : PCT/GB00/00263
International Filing Date : 28 January 2000 (28.01.00)
Earliest Priority Date : 28 January 1999 (28.01.99)
Title : MANIPULATING ISOPRENOID EXPRESSION
Examiner : Unknown
Art Unit : 1651

CERTIFICATE OF EXPRESS MAILING UNDER 37 C.F.R. §1.8(a)

The undersigned hereby certifies that this document is being placed in the United States mail via express
mail, addressed to Box PCT, U.S. Patent and Trademark Office, P.O. Box 2327, Arlington, VA 22202 on the 27th
day of November, 2001.

Monica

Box PCT
U.S. Patent and Trademark Office
P.O. Box 2327
Arlington, VA 22202

COMMUNICATION

Dear Sir:

The Notification of Missing Requirements which was mailed on September 6, 2001,
states that claims 12, 13, and 26 are multiple dependent claims, and as such the multiple
dependent claim fee of $270.00 is required for these claims. However, this seems to be an error.
Upon review of the application, Applicants found that claims 12, 13, and 26 were not multiple
dependent claims. Therefore, Applicants believe there is no additional claim fee required in this
application.

If any fee is determined to be required, please charge the balance to the account of the
undersigned, deposit account number 23/2825.

Respectfully submitted,

WOLF, GREENFIELD & SACKS, P.C.
600 Atlantic Avenue
Boston, MA 02210-2211
Tel. (617) 720-3500

Attorney's Docket No.: B0192/7031
Date: November 27, 2001
x12/06/01
CRF Errors Corrected by the STIC Systems Branch

Serial Number: 09/890,229    Edited by:

□ Changed a file from non-ASCII to ASCII    Verified by: _ (STIC sta

□ Changed the margins in cases where the sequence text was "wrapped" down to the next line.

□ Edited a format error in the Current Application Data section, specifically:

□ Edited the Current Application Data section with the actual current number. The number inputted by the applicant was □ the prior application data; or □ other

□ Added the mandatory heading and subheadings for "Current Application Data".

□ Edited the "Number of Sequences" field. The applicant spelled out a number instead of using an integer.

□ Changed the spelling of a mandatory field (the headings or subheadings), specifically:

Corrected the SEQ ID NO when obviously incorrect. The sequence numbers that were edited were:

☒ Inserted or corrected a nucleic number at the end of a nucleic line. SEQ ID NO's edited:

□ Corrected subheading placement. All responses must be on the same line as each subheading. If the applicant placed a response below the subheading, this was moved to its appropriate place.

□ Inserted colons after headings/subheadings. Headings edited included:

□ Deleted extra, invalid, headings used by an applicant, specifically:

Deleted: ☒ non-ASCII "garbage" at the beginning/end of files; □ secretary initials/filename at end of file; □ page numbers throughout text; □ other invalid text, such as

□ Inserted mandatory headings, specifically:

□ Corrected an obvious error in the response, specifically:

□ Edited identifiers where upper case is used but lower case is required, or vice versa.

□ Corrected an error in the Number of Sequences field, specifically:

□ A "Hard Page Break" code was inserted by the applicant. All occurrences had to be deleted.

□ Deleted ending stop codon in amino acid sequences and adjusted the "(A)Length:" field accordingly (error due to a PatentIn bug). Sequences corrected:

□ Other:

Examiner: The above corrections must be communicated to the applicant in the first Office Action. DO NOT send a copy of this form. 3/1/95
Attorney's Docket No. B0192/7031

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicants : Bramley et al.
U.S. Serial No. : 09/890,229
International Application No. : PCT/GB00/00263
International Filing Date : 28 January 2000 (28.01.00)
Earliest Priority Date : 28 January 1999 (28.01.99)
Title : MANIPULATING ISOPRENOID EXPRESSION
Examiner : Unknown
Art Unit : 1651

CERTIFICATE OF EXPRESS MAILING UNDER 37 C.F.R. §1.8(a)

The undersigned hereby certifies that this document is being placed in the United States mail via express
mail, addressed to Box PCT, U.S. Patent and Trademark Office, P.O. Box 2327, Arlington, VA 22202, on the 27th
day of November, 2001.

Box PCT
U.S. Patent and Trademark Office
P.O. Box 2327
Arlington, VA 22202

STATEMENT PURSUANT TO 37 CFR 1.821(f) AND 37 CFR 1.825(a) and (b)

This statement is made pursuant to 37 CFR 1.821(f), and 37 CFR 1.825(a) and (b).
Applicants submit herewith a substitute copy of the written sequence listing and a computer
readable diskette to comply with the sequence requirements.

Applicants' attorney states that the information recorded in computer readable form is
identical to the written sequence listing and that neither the computer readable form nor the
written sequence listing contain new matter.

Respectfully submitted,

600 Atlantic Avenue
Boston, MA 02210-2211
Tel. (617) 720-3500

Attorney's Docket No.: B0192/7031
Date: November 27, 2001
x12/06/01
Attorney's Docket No: B0192/7031

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicants : Peter BRAMLEY et al.
International Application No. : PCT/GB00/00263
International Filing Date : 28 January 2000 (28.01.00)
Earliest Priority Date : 28 January 1999 (28.01.99)
Title : MANIPULATING ISOPRENOID EXPRESSION

Commissioner for Patents
Box PCT
Washington, DC 20231

PRELIMINARY AMENDMENT 

Sir: 

Please amend the United States national phase application of the above-identified PCT 
international application as follows. 

In the Specification 

Please amend the specification as follows: 

Please add the following section as the first section of the specification following the
title.

Related Applications

This application is a national stage filing under 35 U.S.C. § 371 of PCT International
application PCT/GB00/00263, and filed 28 January 2000, which was published under PCT
Article 21(2) in English.

Foreign priority benefits are claimed under 35 U.S.C. § 119(a)-(d) of Great Britain
application number GB 9901902.8, filed 28 January 1999.

Please substitute paragraphs of the specification as follows. 

Substitute the following paragraph for the paragraph that begins on line 33, page 13 of 
the specification as filed. 



Figure 3: is an illustration of an amino acid sequence alignment of DXP synthases used in
the present invention, Synechocystis sp. 6803 (S.s) (GenBank D90903) (SEQ ID
NO: 1), B. subtilis (B.s) (GenBank D84432) (SEQ ID NO: 2) and E. coli (E.c)
(GenBank AF035440) (SEQ ID NO: 3). The consensus line (consen) shows 
residues conserved in all three sequences (upper case letters) or residues which 
are identical in two sequences and replaced by an equivalent amino acid in the 
third sequence (+). The conserved histidine domain putatively involved in proton 
transfer is over lined and numbered 1. The second over lined domain (2) denotes 
the consensus thiamin pyrophosphate (TPP)-binding motif. 

Substitute the following paragraph for the paragraph that begins on line 35, page 14 of 
the specification as filed. 

Figure 6: is a diagrammatic illustration of vector pVB6_TSEC_LML (SEQ ID NO: 6). 

Substitute the following paragraph for the paragraph that begins on line 3, page 15 of the 
specification as filed. 

Figure 7: is a diagrammatic representation of plasmid pVB6_35S_TSEC_LML (SEQ ID 
NO: 5). 

Substitute the following paragraph for the paragraph that begins on line 6, page 15 of the 
specification as filed. 

Figure 8: is an illustration of the amino acid sequence of E. coli DXPS (SEQ ID NO: 3).

Substitute the following paragraph for the paragraph that begins on line 9, page 15 of the 
specification as filed. 

Figure 9: is an illustration of the transit peptide used in tomato plants (SEQ ID NO: 4). 
Substitute the following paragraph for the paragraph that begins on line 15, page 16 of
the specification as filed.

Based on the nucleotide sequence of ORF sll1945 from the genome database for Synechocystis
sp. PCC 6803 [23], primers were designed to clone the putative dxps gene by polymerase chain
reaction (PCR). The forward primer 5'-GTCCCAATCCACCATGCACATCAG-3' (SEQ ID
NO: 7) overlaps the beginning of the coding sequence. The reverse primer
5'-CCCTCGACAAATGCAAAATGTATC-3' (SEQ ID NO: 8) lies outside the stop codon of the
gene. A PCR (25 cycles) using Pfu DNA polymerase (Stratagene) produced a DNA fragment of
the expected size (~1.9 kb). Subsequent sequencing of the fragment confirmed the product to be
the ORF sll1945. The B. subtilis dxps gene was also cloned by PCR using primers designed to
amplify the gene encoding the product YqiE, identified in the Bacillus subtilis genome database
[24]. The forward primer 5'-GATCCGCTATGGATCTTTTATC-3' (SEQ ID NO: 9) contains a
modified base substitution at the predicted start codon (underlined) for improved expression in E.
coli. The reverse primer 5'-ATCTAATCGTTCTTTCTTTGAC-3' (SEQ ID NO: 10) lies outside
the stop codon of the dxps gene. After PCR (25 cycles) a DNA product of the expected size
(~1.9 kb) was obtained, and when sequenced proved to be identical to the gene encoding the
product YqiE. The PCR products from both reactions were treated with Taq DNA polymerase
(GibcoBRL) at 72°C for 10 min to synthesise blunt ended fragments. The fragments were then
cloned into the EcoRV site of the pBluescript vector (Stratagene) using T4 DNA ligase
(Fermentas) (Fig. 2).
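As a quick consistency check on the four primers quoted in the paragraph above, the following sketch reports each primer's length, GC fraction, and reverse complement, and reads the codons that follow the first ATG in the Synechocystis forward primer. The helper names and the three-entry codon table are illustrative assumptions; only the primer sequences come from the text.

```python
# Primer sequences (5' to 3') as quoted in the paragraph above.
PRIMERS = {
    "SEQ ID NO: 7": "GTCCCAATCCACCATGCACATCAG",
    "SEQ ID NO: 8": "CCCTCGACAAATGCAAAATGTATC",
    "SEQ ID NO: 9": "GATCCGCTATGGATCTTTTATC",
    "SEQ ID NO: 10": "ATCTAATCGTTCTTTCTTTGAC",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Deliberately partial codon table: only the codons needed for this check.
CODONS = {"ATG": "Met", "CAC": "His", "ATC": "Ile"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def codons_from_first_atg(seq):
    """Return the complete codons starting at the first ATG."""
    start = seq.find("ATG")
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))

forward = PRIMERS["SEQ ID NO: 7"]
print([CODONS[c] for c in codons_from_first_atg(forward)])  # ['Met', 'His', 'Ile']
```

The codons read from the forward primer (Met, His, Ile) agree with the first three residues of SEQ ID NO: 1 in the sequence listing, which is consistent with the statement that this primer overlaps the beginning of the coding sequence.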


Substitute the following paragraph for the paragraph that begins on line 32, page 20 of 
the specification as filed. 

The amino acid sequence of the DXPS proteins of Synechocystis sp. 6803 (SEQ ID NO: 1) and
B. subtilis (SEQ ID NO: 2) exhibited significant similarity to each other over their entire length
(47% identities) and to E. coli DXPS (SEQ ID NO: 3) (B. subtilis (44% identities) and
Synechocystis sp. 6803 (46% identities)) (Fig. 3). All three polypeptides share two conserved
domains; one thought to be involved in thiamin binding [30] and a histidine residue postulated to
participate in proton transfer [31], both of which are detailed in Fig. 3. The existence of
thiamin-binding domain in each of the polypeptides explains the cofactor requirement of thiamin for
DXPS activity [12]. The high degree of polypeptide sequence identity, particularly the
distribution of conserved domains, in all three indicates that they all encode DXPS or a closely
related gene product.
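The "% identities" figures quoted above are pairwise measures taken over an alignment. The sketch below shows one common way such a figure can be computed from two already-aligned sequences: identical positions divided by the number of columns in which neither sequence has a gap. The specification does not state which convention or alignment program was used, so both the denominator and the toy sequences here are assumptions for illustration only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy one-letter alignment, not taken from SEQ ID NOs: 1 to 3.
print(round(percent_identity("MHISELTH", "MHLSE-TH"), 1))  # 85.7
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which can change the reported percentage by a few points.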

Substitute the following paragraph for the paragraph that begins on line 34, page 27 of 
the specification as filed. 

Forward : 5'-GCG CCG CTA TTT ACT CGA-3' (SEQ ID NO: 11)

Substitute the following paragraph for the paragraph that begins on line 1, page 28 of the
specification as filed.

Reverse : 5'-TTT CTC TGG CGT GCC GCC-3' (SEQ ID NO: 12)

A marked up copy of the specification amendments is attached as Appendix A to
facilitate the Examiner's review.

Please substitute the enclosed Sequence Listing for the presently filed Sequence Listing.

In the Claims 

Please amend the claims as follows: 

5. (amended) A method according to claim 3 wherein said vector comprises one or more 
nucleic acid sequences encoding a polypeptide(s) capable of producing a desired isoprenoid. 

6. (amended) A method according to claim 3 wherein said plant or plant cell is 
transformed with a further vector comprising one or more nucleic acid sequences encoding a 
polypeptide(s) capable of producing a desired isoprenoid. 



7. (amended) A method according to claim 3 wherein said vector comprising said 
nucleic acid sequence(s) encoding said DXPS and/or said polypeptide(s) capable of producing 
said isoprenoid further comprises a nucleic acid sequence of either a tissue specific promoter 
and/or encoding a plastid transit peptide. 

8. (amended) A method according to claim 5 wherein said desired isoprenoid is one 
conferring a nutritional benefit. 

16. (amended) A method according to claim 14, wherein said cell is any of a bacterial,
yeast or algal cell. 

18. (amended) A method according to claim 14 wherein said organism is a plant. 

19. (amended) A method according to claim 14 wherein said cell is any of a bacterial, 
yeast or algal cell. 

20. (amended) A method according to claim 14 wherein said bacterial cell is E. coli.

27. (amended) A transgenic cell, tissue or organism according to claim 24, wherein said
organism is a plant. 

29. (amended) Progeny of the organism according to claim 24 having increased 
isoprenoid activity. 

30. (amended) A transformed plant comprising a transgene capable of expressing DXPS 
from E. coli having the sequence according to SEQ ID NO: 3 and which plant comprises a higher 
level of isoprenoid than an untransformed plant. 



31. (amended) A transformed plant according to claim 30 comprising any of constructs
pVB6_TSEC_LML (SEQ ID NO: 6) or pVB6_35S_TSEC-LML (SEQ ID NO: 5).

32. (amended) A transformed plant according to claim 30 wherein said plant is a tomato
plant.

A marked up copy of the amended claims is attached as Appendix B to facilitate the
Examiner's review.

Remarks

Applicants have amended the specification to provide priority application information
and information regarding the publication in English under PCT Article 21(2) of the PCT
application of which the above-identified application is a U.S. national stage application. The
specification also has been amended to insert sequence identifiers. A substitute Sequence Listing
is filed herewith.

Applicants have amended the claims to eliminate certain dependencies and to add
sequence identifiers. No new matter has been added.

Respectfully submitted,

John R. Van Amsterdam
Reg. No. 40,212
WOLF, GREENFIELD & SACKS, P.C.
600 Atlantic Avenue
Boston, Massachusetts 02210-2211
Tel: (617) 720-3500

Attorney's Docket No. B0192/7031
Dated: I? July 2001



Appendix A 

Added Section 

Related Applications 

This application is a national stage filing under 35 U.S.C. § 371 of PCT International 
application PCT/GB00/00263, and filed 28 January 2000, which was published under PCT 
Article 21(2) in English. 

Foreign priority benefits are claimed under 35 U.S.C. § 119(a)-(d) of Great Britain
application number GB 9901902.8, filed 28 January 1999.

Amended Paragraphs of the Specification

Substitute the following paragraph for the paragraph that begins on line 33, page 13 of
the specification as filed.

Figure 3: is an illustration of an amino acid sequence alignment of DXP synthases used in
the present invention, Synechocystis sp. 6803 (S.s) (GenBank D90903) (SEQ ID
NO: 1), B. subtilis (B.s) (GenBank D84432) (SEQ ID NO: 2) and E. coli (E.c)
(GenBank AF035440) (SEQ ID NO: 3). The consensus line (consen) shows
residues conserved in all three sequences (upper case letters) or residues which
are identical in two sequences and replaced by an equivalent amino acid in the
third sequence (+). The conserved histidine domain putatively involved in proton
transfer is over lined and numbered 1. The second over lined domain (2) denotes
the consensus thiamin pyrophosphate (TPP)-binding motif.

Substitute the following paragraph for the paragraph that begins on line 35, page 14 of
the specification as filed.

Figure 6: is a diagrammatic illustration of vector pVB6_TSEC_LML (SEQ ID NO: 6).
Substitute the following paragraph for the paragraph that begins on line 3, page 15 of the
specification as filed.

Figure 7: is a diagrammatic representation of plasmid pVB6_35S_TSEC_LML (SEQ ID
NO: 5).

Substitute the following paragraph for the paragraph that begins on line 6, page 15 of the
specification as filed.

Figure 8: is an illustration of the amino acid sequence of E. coli DXPS (SEQ ID NO: 3).

Substitute the following paragraph for the paragraph that begins on line 9, page 15 of the
specification as filed.

Figure 9: is an illustration of the transit peptide used in tomato plants (SEQ ID NO: 4).

Substitute the following paragraph for the paragraph that begins on line 15, page 16 of
the specification as filed.

Based on the nucleotide sequence of ORF sll1945 from the genome database for Synechocystis
sp. PCC 6803 [23], primers were designed to clone the putative dxps gene by polymerase chain
reaction (PCR). The forward primer 5'-GTCCCAATCCACCATGCACATCAG-3' (SEQ ID
NO: 7) overlaps the beginning of the coding sequence. The reverse primer
5'-CCCTCGACAAATGCAAAATGTATC-3' (SEQ ID NO: 8) lies outside the stop codon of the
gene. A PCR (25 cycles) using Pfu DNA polymerase (Stratagene) produced a DNA fragment of
the expected size (~1.9 kb). Subsequent sequencing of the fragment confirmed the product to be
the ORF sll1945. The B. subtilis dxps gene was also cloned by PCR using primers designed to
amplify the gene encoding the product YqiE, identified in the Bacillus subtilis genome database
[24]. The forward primer 5'-GATCCGCTATGGATCTTTTATC-3' (SEQ ID NO: 9) contains a
modified base substitution at the predicted start codon (underlined) for improved expression in E.
coli. The reverse primer 5'-ATCTAATCGTTCTTTCTTTGAC-3' (SEQ ID NO: 10) lies outside
the stop codon of the dxps gene. After PCR (25 cycles) a DNA product of the expected size
(~1.9 kb) was obtained, and when sequenced proved to be identical to the gene encoding the
product YqiE. The PCR products from both reactions were treated with Taq DNA polymerase
(GibcoBRL) at 72°C for 10 min to synthesise blunt ended fragments. The fragments were then
cloned into the EcoRV site of the pBluescript vector (Stratagene) using T4 DNA ligase
(Fermentas) (Fig. 2).

Substitute the following paragraph for the paragraph that begins on line 32, page 20 of 
the specification as filed. 

The amino acid sequence of the DXPS proteins of Synechocystis sp. 6803 (SEQ ID NO: 1) and
B. subtilis (SEQ ID NO: 2) exhibited significant similarity to each other over their entire length
(47% identities) and to E. coli DXPS (SEQ ID NO: 3) (B. subtilis (44% identities) and
Synechocystis sp. 6803 (46% identities)) (Fig. 3). All three polypeptides share two conserved
domains; one thought to be involved in thiamin binding [30] and a histidine residue postulated to
participate in proton transfer [31], both of which are detailed in Fig. 3. The existence of
thiamin-binding domain in each of the polypeptides explains the cofactor requirement of thiamin for
DXPS activity [12]. The high degree of polypeptide sequence identity, particularly the
distribution of conserved domains, in all three indicates that they all encode DXPS or a closely
related gene product.

Substitute the following paragraph for the paragraph that begins on line 34, page 27 of
the specification as filed.

Forward : 5'-GCG CCG CTA TTT ACT CGA-3' (SEQ ID NO: 11)

Substitute the following paragraph for the paragraph that begins on line 1, page 28 of the
specification as filed.

Reverse : 5'-TTT CTC TGG CGT GCC GCC-3' (SEQ ID NO: 12)



Appendix B 

5. A method according to claim 3 [or 4] wherein said vector comprises one or more 
nucleic acid sequences encoding a polypeptide(s) capable of producing a desired isoprenoid. 

6. A method according to claim 3 [or 4] wherein said plant or plant cell is 
transformed with a further vector comprising one or more nucleic acid sequences encoding a 
polypeptide(s) capable of producing a desired isoprenoid. 

7. A method according to [any of] claim[s] 3 [to 6] wherein said vector comprising 
said nucleic acid sequence(s) encoding said DXPS and/or said polypeptide(s) capable of 
producing said isoprenoid further comprises a nucleic acid sequence of either a tissue specific 
promoter and/or encoding a plastid transit peptide. 

8. A method according to [any of] claim[s] 5 [to 7] wherein said desired isoprenoid
is one conferring a nutritional benefit. 

16. A method according to claim 14 [or 15], wherein said cell is any of a bacterial, 
yeast or algal cell. 

18. A method according to claim 14 [or 15] wherein said organism is a plant. 

19. A method according to [any of] claim[s] 14 [to 18] wherein said cell is any of a 
bacterial, yeast or algal cell. 

20. A method according to [any of] claim[s] 14 [to 19] wherein said bacterial cell is
E. coli.

27. A transgenic cell, tissue or organism according to [any of] claim[s] 24 [to 26],
wherein said organism is a plant. 



29. Progeny of the organism according to [any of] claim[s] 24 [to 28] having 
increased isoprenoid activity. 



30. A transformed plant comprising a transgene capable of expressing DXPS from E.
coli having the sequence according to [Figure 8] SEQ ID NO: 3 and which plant comprises a
higher level of isoprenoid than an untransformed plant.

31. A transformed plant according to claim 30 comprising any of constructs
pVB6_TSEC_LML (SEQ ID NO: 6) or pVB6_35S_TSEC-LML (SEQ ID NO: 5) [illustrated in
Figures 6 and 7].

32. A transformed plant according to claim 30 [or 31] wherein said plant is a tomato
plant.
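The marked-up claims in this appendix appear to show deleted text in square brackets. Assuming that convention, the clean claim text can be recovered by dropping every bracketed span, as the sketch below illustrates. The function name is hypothetical, and additions (which would have been underlined in the paper copy) cannot be distinguished in this plain-text rendering, so the sketch handles deletions only.

```python
import re


def clean_claim(marked_up):
    """Remove bracketed deletions from a marked-up claim and tidy the spacing."""
    # Drop each [ ... ] span together with any whitespace immediately before it.
    text = re.sub(r"\s*\[[^\]]*\]", "", marked_up)
    # Collapse any runs of whitespace left behind.
    return re.sub(r"\s{2,}", " ", text).strip()


marked = ("A method according to [any of] claim[s] 14 [to 19] "
          "wherein said bacterial cell is E. coli.")
print(clean_claim(marked))
# A method according to claim 14 wherein said bacterial cell is E. coli.
```

Applied to claim 20 above, the output matches the amended claim 20 as presented in the "In the Claims" section, where the dependency on claims 14 to 19 is narrowed to claim 14 alone.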



SEQUENCE LISTING

<110> Bramley, Peter Michael
      Harker, Mark

<120> Manipulating Isoprenoid Expression

<130> B0192/7031

<140> 09/890,229
<141> 2000-01-28

<150> GB 9901902.8
<151> 1999-01-28

<160> 12

<170> PatentIn version 3.0

<210> 1
<211> 640
<212> PRT
<213> Synechocystis sp.

<400> 1

Met His Ile Ser Glu Leu Thr His Pro Asn Glu Leu Lys Gly Leu Ser 
1 5 10 15 

Ile Arg Glu Leu Glu Glu Val Ser Arg Gln Ile Arg Glu Lys His Leu 
20 25 30 

Gln Thr Val Ala Thr Ser Gly Gly His Leu Gly Pro Gly Leu Gly Val 
35 40 45 

Val Glu Leu Thr Val Ala Leu Tyr Ser Thr Leu Asp Leu Asp Lys Asp 
50 55 60 

Arg Val Ile Trp Asp Val Gly His Gln Ala Tyr Pro His Lys Met Leu 
65 70 75 80 

Thr Gly Arg Tyr His Asp Phe His Thr Leu Arg Gln Lys Asp Gly Val 
85 90 95 

Ala Gly Tyr Leu Lys Arg Ser Glu Ser Arg Phe Asp His Phe Gly Ala 
100 105 110 

Gly His Ala Ser Thr Ser Ile Ser Ala Gly Leu Gly Met Ala Leu Ala 
115 120 125 

Arg Asp Ala Lys Gly Glu Asp Phe Lys Val Val Ser Ile Ile Gly Asp 
130 135 140 

Gly Ala Leu Thr Gly Gly Met Ala Leu Glu Ala Ile Asn His Ala Gly 
145 150 155 160 

His Leu Pro His Thr Arg Leu Met Val Ile Leu Asn Asp Asn Glu Met 
165 170 175 

Ser Ile Ser Pro Asn Val Gly Ala Ile Ser Arg Tyr Leu Asn Lys Val 
180 185 190 

Arg Leu Ser Ser Pro Met Gln Phe Leu Thr Asp Asn Leu Glu Glu Gln 
195 200 205 

Ile Lys His Leu Pro Phe Val Gly Asp Ser Leu Thr Pro Glu Met Glu 
210 215 220 

Arg Val Lys Glu Gly Met Lys Arg Leu Val Val Pro Lys Val Gly Ala 
225 230 235 240 

Val Ile Glu Glu Leu Gly Phe Lys Tyr Phe Gly Pro Ile Asp Gly His 
245 250 255 

Ser Leu Gln Glu Leu Ile Asp Thr Phe Lys Gln Ala Glu Lys Val Pro 
260 265 270 

Gly Pro Val Phe Val His Val Ser Thr Thr Lys Gly Lys Gly Tyr Asp 
275 280 285 

Leu Ala Glu Lys Asp Gln Val Gly Tyr His Ala Gln Ser Pro Phe Asn 
290 295 300 

Leu Ser Thr Gly Lys Ala Tyr Pro Ser Ser Lys Pro Lys Pro Pro Ser 
305 310 315 320 

Tyr Ser Lys Val Phe Ala His Thr Leu Thr Thr Leu Ala Lys Glu Asn 
325 330 335 

Pro Asn Ile Val Gly Ile Thr Ala Ala Met Ala Thr Gly Thr Gly Leu 
340 345 350 

Asp Lys Leu Gln Ala Lys Leu Pro Lys Gln Tyr Val Asp Val Gly Ile 
355 360 365 

Ala Glu Gln His Ala Val Thr Leu Ala Ala Gly Met Ala Cys Glu Gly 
370 375 380 

Ile Arg Pro Val Val Ala Ile Tyr Ser Thr Phe Leu Gln Arg Gly Tyr 
385 390 395 400 

Asp Gln Ile Ile His Asp Val Cys Ile Gln Lys Leu Pro Val Phe Phe 
405 410 415 

Cys Leu Asp Arg Ala Gly Ile Val Gly Ala Asp Gly Pro Thr His Gln 
420 425 430 

Gly Met Tyr Asp Ile Ala Tyr Leu Arg Cys Ile Pro Asn Leu Val Leu 
435 440 445 

Met Ala Pro Lys Asp Glu Ala Glu Leu Gln Gln Met Leu Val Thr Gly 
450 455 460 

Val Asn Tyr Thr Gly Gly Ala Ile Ala Met Arg Tyr Pro Arg Gly Asn 
465 470 475 480 

Gly Ile Gly Val Pro Leu Met Glu Glu Gly Trp Glu Pro Leu Glu Ile 
485 490 495 



Gly Lys Ala Glu Ile Leu Arg Ser Gly Asp Asp Val Leu Leu Leu Gly 
500 505 510 

Tyr Gly Ser Met Val Tyr Pro Ala Leu Gln Thr Ala Glu Leu Leu His 
515 520 525 

Glu His Gly Ile Glu Ala Thr Val Val Asn Ala Arg Phe Val Lys Pro 
530 535 540 

Leu Asp Thr Glu Leu Ile Leu Pro Leu Ala Glu Arg Ile Gly Lys Val 
545 550 555 560 

Val Thr Met Glu Glu Gly Cys Leu Met Gly Gly Phe Gly Ser Ala Val 
565 570 575 

Ala Glu Ala Leu Met Asp Asn Asn Val Leu Val Pro Leu Lys Arg Leu 
580 585 590 

Gly Val Pro Asp Ile Leu Val Asp His Ala Thr Pro Glu Gln Ser Thr 
595 600 605 

Val Asp Leu Gly Leu Thr Pro Ala Gln Met Ala Gln Asn Ile Met Ala 
610 615 620 

Ser Leu Phe Lys Thr Glu Thr Glu Ser Val Val Ala Pro Gly Val Ser 
625 630 635 640 

<210> 2 
<211> 633 
<212> PRT 
<213> Bacillus subtilis 

<400> 2 

Met Asp Leu Leu Ser Ile Gln Asp Pro Ser Phe Leu Lys Asn Met Ser 
1 5 10 15 

Ile Asp Glu Leu Glu Lys Leu Ser Asp Glu Ile Arg Gln Phe Leu Ile 
20 25 30 

Thr Ser Leu Ser Ala Ser Gly Gly His Ile Gly Pro Asn Leu Gly Val 
35 40 45 

Val Glu Leu Thr Val Ala Leu His Lys Glu Phe Asn Ser Pro Lys Asp 
50 55 60 

Lys Phe Leu Trp Asp Val Gly His Gln Ser Tyr Val His Lys Leu Leu 
65 70 75 80 

Thr Gly Arg Gly Lys Glu Phe Ala Thr Leu Arg Gln Tyr Lys Gly Leu 
85 90 95 

Cys Gly Phe Pro Lys Arg Ser Glu Ser Glu His Asp Val Trp Glu Thr 
100 105 110 

Gly His Ser Ser Thr Ser Leu Ser Gly Ala Met Gly Met Ala Ala Ala 
115 120 125 

Arg Asp Ile Lys Gly Thr Asp Glu Tyr Ile Ile Pro Ile Ile Gly Asp 
130 135 140 

Gly Ala Leu Thr Gly Gly Met Ala Leu Glu Ala Leu Asn His Ile Gly 
145 150 155 160 

Asp Glu Lys Lys Asp Met Ile Val Ile Leu Asn Asp Asn Glu Met Ser 
165 170 175 

Ile Ala Pro Asn Val Gly Ala Ile His Ser Met Leu Gly Arg Leu Arg 
180 185 190 

Thr Ala Gly Lys Tyr Gln Trp Val Lys Asp Glu Leu Glu Tyr Leu Phe 
195 200 205 

Lys Lys Ile Pro Ala Val Gly Gly Lys Leu Ala Ala Thr Ala Glu Arg 
210 215 220 

Val Lys Asp Ser Leu Lys Tyr Met Leu Val Ser Gly Met Phe Phe Glu 
225 230 235 240 

Glu Leu Gly Phe Thr Tyr Leu Gly Pro Val Asp Gly His Ser Tyr His 
245 250 255 

Glu Leu Ile Glu Asn Leu Gln Tyr Ala Lys Lys Thr Lys Gly Pro Val 
260 265 270 

Leu Leu His Val Ile Thr Lys Lys Gly Lys Gly Tyr Lys Pro Ala Glu 
275 280 285 

Thr Asp Thr Ile Gly Thr Trp His Gly Thr Gly Pro Tyr Lys Ile Asn 
290 295 300 

Thr Gly Asp Phe Val Lys Pro Lys Ala Ala Ala Pro Ser Trp Ser Gly 
305 310 315 320 

Leu Val Ser Gly Thr Val Gln Arg Met Ala Arg Glu Asp Gly Arg Ile 
325 330 335 

Val Ala Ile Thr Pro Ala Met Pro Val Gly Ser Lys Leu Glu Gly Phe 
340 345 350 

Ala Lys Glu Phe Pro Asp Arg Met Phe Asp Val Gly Ile Ala Glu Gln 
355 360 365 

His Ala Ala Thr Met Ala Ala Ala Met Ala Met Gln Gly Met Lys Pro 
370 375 380 

Phe Leu Ala Ile Tyr Ser Thr Phe Leu Gln Arg Ala Tyr Asp Gln Val 
385 390 395 400 

Val His Asp Ile Cys Arg Gln Asn Ala Asn Val Phe Ile Gly Ile Asp 
405 410 415 

Arg Ala Gly Leu Val Gly Ala Asp Gly Glu Thr His Gln Gly Val Phe 
420 425 430 

Asp Ile Ala Phe Met Arg His Ile Pro Asn Met Val Leu Met Met Pro 
435 440 445 

Lys Asp Glu Asn Glu Gly Gln His Met Val His Thr Ala Leu Ser Tyr 
450 455 460 



MANIPULATING ISOPRENOID EXPRESSION 



The present invention is concerned with manipulating 
or altering isoprenoid expression in a cell or 
organism which biosynthesises isopentenyl diphosphate 
(IPP), which is the universal precursor of all 
isoprenoids in nature, via a mevalonate independent 
pathway. 

Isoprenoids constitute the largest class of natural 
products occurring in nature, with over 29,000 
individual compounds identified to date [1]. 
Chemically, they are extremely diverse in their 
structure and complexity. The fundamental biological 
functions performed by isoprenoids ensure they are 
essential for the normal growth and developmental 
processes in all living organisms. These include 
functioning as eukaryotic membrane stabilisers 
(sterols), plant hormones (gibberellins and abscisic 
acid), providing pigments for photosynthesis 
(carotenoids and phytol side chain of chlorophyll), 
and as carriers for electron transport (menaquinone, 
plastoquinone and ubiquinone). 

All isoprenoids are synthesised via a common metabolic 
precursor, isopentenyl diphosphate (IPP; C5). Until 
recently, the biosynthesis of IPP was generally 
assumed to proceed exclusively from acetyl-CoA via the 
classical mevalonate pathway (Fig. 1) [2]. The enzyme 
3-hydroxy-3-methylglutaryl coenzyme A reductase (HMGR) 
catalyses the conversion of hydroxymethylglutaryl-CoA 
to mevalonate, a key reaction of the 
mevalonate-dependent IPP biosynthetic pathway. Recent 
studies have demonstrated that mevalonate is not the 
biosynthetic precursor of IPP in all living organisms 
[3,4]. The existence of an alternative, 
mevalonate-independent pathway for IPP formation was 
identified initially in several species of eubacteria 
[4,5] and a green alga [6]. The pathway was 
subsequently shown to be operational in the plastids 
of higher plants [7-10]. The first reaction in the 
non-mevalonate pathway is the transketolase-type 
condensation reaction of pyruvate and 
D-glyceraldehyde-3-phosphate to yield 
1-deoxy-D-xylulose-5-phosphate (DXP) (Fig. 1). This 
reaction is catalysed by the enzyme 
1-deoxy-D-xylulose-5-phosphate synthase. The second 
reaction in the pathway is the conversion of DXP to 
2-C-methyl-D-erythritol-4-phosphate (MEP). The 
reactions which convert MEP to IPP have yet to be 
characterised. 

The cloning and characterisation of the DXP synthase 
(dxps) gene has been described for a number of 
organisms including Escherichia coli [11,12] and higher 
plants [13-15]. The CLA1 gene product from 
Arabidopsis thaliana associated with chloroplast 
development [16], for example, has been shown to 
exhibit DXPS activity [11]. Recently, a gene 
responsible for the reduction of DXP to 
2-C-methyl-D-erythritol-4-phosphate, the proposed 
next step in the non-mevalonate pathway, has been 
cloned from E. coli [17]. 

The present inventors have surprisingly found that the 
first reaction in the mevalonate-independent IPP 
biosynthetic pathway is highly influential in 
controlling the levels of isoprenoids which can be 
formed in a cell or organism within which the 
mevalonate independent IPP biosynthetic pathway is 
present. The enzyme DXPS, or a functional equivalent 
thereof, has been identified by the present inventors 
as catalysing a rate-limiting step in isoprenoid 
biosynthesis, and DXPS activity plays an important 
role in channelling the carbon resources of the cell 
into the isoprenoid biosynthetic pathway. 

Therefore, according to a first aspect of the present 
invention there is provided a method of manipulating 
isoprenoid expression in a cell possessing a 
mevalonate independent isopentenyl diphosphate 
synthesising pathway, which method comprises altering 
the activity of the enzyme 1-deoxy-D-xylulose-5-phosphate 
synthase (DXPS), or a functional equivalent 
thereof. Thus, advantageously, the rate-limiting 
effect conferred by DXPS activity on the IPP 
biosynthetic pathway can be utilised to manipulate the 
resultant levels of isoprenoids in a cell by altering 
the activity or expression of DXPS. 

Preferably, the levels of isoprenoids in a cell can be 
enhanced by increasing the activity or expression of 
the DXPS. Likewise, reduced levels of isoprenoids can 
be achieved by reducing or inhibiting activity or 
expression of DXPS in a cell or organism. Increasing 
the DXPS activity may be achieved by, for example, 
transforming the cell, which may itself be part of a 
cell line or an organism, with an expression vector 
comprising a nucleic acid molecule encoding DXPS, 
which may advantageously be operably linked to a 
reporter molecule, such as used in the GUS assay which 
is known in the art. Preferably, the vector comprises 
any of the vectors designated pBSDXPS or pSYDXPS, 
illustrated in Figure 2. 

An alternative method for altering expression may 
comprise utilising a technique known as Enforced 
Evolution, or DNA Shuffling (see Patten et al., Current 
Opinion in Biotechnology, 1997, Vol. 8, No. 6, pp 724-733; 
Crameri et al., Nature, 1998, Vol. 391, No. 6664, 
pp 288-291; and Harayama S, Trends in Biotechnology, 
1998, Vol. 16, No. 2, pp 76-82). According to this 
method, improvements in enzyme activity can be achieved 
by reassembling DNA segments into a full length gene 
by homologous or site specific combination. Before 
the assembly, the segments are often subjected to 
random mutagenesis by error prone PCR, random 
nucleotide insertion, or other such methods. The 
genes can be expressed in suitable microbial hosts 
leading to the production of functional polypeptides, 
such as DXPS. 



The nucleic acid encoding the DXPS may be endogenous 
to the cell or organism into which it will be 
transformed or, alternatively, may be exogenous. In 
one embodiment of the invention, the method may also 
comprise transforming the cell or organism with a 
vector comprising one or more nucleic acid sequences 
suitable for producing a desired isoprenoid. This 
aspect of the invention is particularly advantageous 
because it allows isoprenoids to be produced in a 
cell or organism independent of the source of the 
isoprenoid, which may be derived from cells or 
organisms which do not possess the mevalonate 
independent IPP biosynthesising pathway. Similarly, 
enhanced levels of an isoprenoid can be produced in 
cells or organisms having the mevalonate independent 
IPP biosynthetic pathway. 

Therefore, in the example where the cell is E. coli it 
is possible to engineer production of an isoprenoid 
which is exogenous to the E. coli bacterium, which 
isoprenoid may be, for example, any of the carotenoids 
of plants, such as lycopene, or even an isoprenoid of 
human origin. 

Carotenoids are yellow-orange-red lipid based pigments 
found in nature. They have been found to be useful in 
a variety of applications, for example, as 
supplements, and particularly vitamin supplements, as 
vegetable oil based food products and food 
ingredients, as feed additives in animal feeds and as 
colorants. Phytoene has been found to be useful in 
treating skin disorders whilst lycopene and α- and 
β-carotene consumption have been implicated as having 
preventative effects against certain kinds of cancers. 
Therefore, it is a highly advantageous aspect of the 
invention that increased production of such compounds 
can be achieved, which compounds can confer 
considerable health care benefits. Once the desired 
carotenoid or other isoprenoid has been produced in E. 
coli, or other suitable organism as defined above, it 
can be isolated using standard bioengineering 
techniques. 

Increases in concentrations of any desired isoprenoid 
may be achieved in a cell, or alternatively an 
organism, which possesses the IPP biosynthetic 
mevalonate independent pathway. For example, crops 
can be engineered using the method of the invention to 
produce increased levels of an isoprenoid which 
confers nutritional benefits to humans following 
consumption of the plant, such as, for example, 
vitamin E and lycopene. 

Therefore, there is also provided by a further aspect 
of the invention a cell or organism having a 
mevalonate independent IPP biosynthetic pathway and 
which has been transformed or transfected with an 
expression vector comprising a nucleic acid molecule 
encoding DXPS or a functional equivalent or 
bioprecursor thereof. As described above, the vector 
may also include one or more further nucleic acid 
sequences suitable for producing a desired isoprenoid, 
or alternatively the one or more nucleic acid 
sequences may be included in a separate vector, 
operably linked to suitable expression control 
sequences. In a particularly preferred embodiment the 
cell or organism comprises a plant. 

An expression vector according to the invention 
includes a vector having a nucleic acid sequence 
operably linked to regulatory sequences, such as 
promoter regions, that are capable of effecting 
expression of said DNA fragments. The term "operably 
linked" refers to a juxtaposition wherein the 
components described are in a relationship permitting 
them to function in their intended manner. Such 
vectors may be transformed into a suitable host cell 
or organism to produce a desired protein, such as DXPS, 
or an isoprenoid according to the method of the 
invention. Thus, in a further aspect, the invention 
provides a process for producing a desired isoprenoid 
which comprises cultivating a host cell, transformed 
or transfected with an expression vector as described 
above, under conditions to provide for expression by 
the vector of DXPS or a functional equivalent thereof 
or suitable polypeptides for producing a desired 
isoprenoid, and optionally recovering the expressed 
polypeptides. 

The vectors may be, for example, plasmid, virus or 
phage vectors provided with an origin of replication, 
optionally a promoter for the expression of said 
nucleotide and optionally a regulator of the promoter. 
The vectors may contain one or more selectable 
markers, such as, for example, ampicillin resistance. 

Regulatory elements required for expression include 
promoter sequences to bind RNA polymerase and 
translation initiation sequences for ribosome 
binding. For example, a bacterial expression vector 
may include a promoter such as the lac promoter and, 
for translation initiation, the Shine-Dalgarno 
sequence and the start codon AUG. Similarly, a 
eukaryotic expression vector may include a 
heterologous or homologous promoter for RNA polymerase 
II, a downstream polyadenylation signal, the start 
codon AUG, and a termination codon for detachment of 
the ribosome. Such vectors may be obtained 
commercially or assembled from the sequences described 
by methods well known in the art. 

By combining the nucleic acid sequences encoding said 
DXPS, and optionally the one or more sequences suitable 
for producing an isoprenoid, with tissue specific 
promoters, enhanced isoprenoid levels in specified 
tissues of plants can be achieved. For example, by 
utilising a seed specific promoter or other 
transcriptional initiation region, elevated levels of 
carotenoids in seeds may be achieved. The seed can 
then be harvested, providing a reservoir for 
the isoprenoid or carotenoid of interest. 

Generally, the nucleic acid molecule encoding said 
DXPS which is included in the vector used in 
accordance with the method of the invention will be 
transformed into a plant cell so that the DXPS 
molecule is directed to the plastids of the plant. 
Accordingly, where the vector is not inserted directly 
into the plastid of the plant, the vector will further 
comprise a nucleic acid sequence operably linked to 
said DXPS or said one or more isoprenoid producing 
nucleic acid sequences, which further sequence will 
encode a transit peptide to direct expression of the 
DXPS or isoprenoid into the plastid. Native or 
heterologous transit peptides may be utilised in this 
embodiment of the invention. 

As aforesaid, the mevalonate independent IPP 
biosynthetic pathway is not present in any higher 
animals, particularly humans. Therefore, the 
inhibition of the reaction catalysed by DXPS provides 
a unique target site to selectively inhibit or 
alleviate bacterial associated infections by altering 
the expression level of or inhibiting function or 
activity of DXPS. 

One method of inhibiting or preventing expression of 
DXPS utilises antisense technology. Antisense 
technology can be used to control gene expression 
through helix formation of antisense DNA or RNA, both 
of which methods are based on polynucleotide binding 
to DNA or RNA. For example, the 5'-coding region of a 
native DNA sequence coding for DXPS according to the 
invention may be used to design an antisense RNA 
oligonucleotide of from 10 to 50 base pairs in length. A 
DNA oligonucleotide is designed to be complementary to 
a region of the gene involved in transcription 
(triple-helix - see Lee et al., Nucl. Acids Res., 
6:3073 (1979); Cooney et al., Science, 241:456 (1988); 
and Dervan et al., Science, 251:1360 (1991)), in which 
case expression of the antisense RNA oligonucleotide 
allows hybridisation to the mRNA in vivo and blocks 
translation of an mRNA molecule into DXPS. 
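
By way of illustration only, the design step described above (an antisense RNA of 10 to 50 bases complementary to the 5'-coding region) can be sketched in Python. The function names and the default length are assumptions made for this sketch and are not taken from the specification; the 11-base test input is the coding portion of the Synechocystis forward primer given in Example 1.

```python
# Minimal sketch (hypothetical helper names): derive an antisense RNA
# oligonucleotide complementary to the 5' end of a coding DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_rna(coding_5prime: str, length: int = 20) -> str:
    """Antisense RNA against the first `length` bases of the coding strand.

    The 10-50 range follows the text; the default of 20 is an assumption.
    """
    if not 10 <= length <= 50:
        raise ValueError("length must be between 10 and 50 bases")
    target = coding_5prime.upper()[:length]
    if len(target) < length:
        raise ValueError("coding sequence is shorter than requested length")
    # The mRNA has the same sequence as the coding strand (with U for T),
    # so the antisense RNA is the reverse complement, written 5' to 3'.
    return reverse_complement(target).replace("T", "U")


if __name__ == "__main__":
    # Start of the Synechocystis dxps coding region as it appears in the
    # forward primer of Example 1 (ATGCACATCAG).
    print(antisense_rna("ATGCACATCAG", length=10))
```

A real design would also need to consider secondary structure and specificity, which this sketch does not attempt.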

Alternatively, compounds can be screened for their 
ability to inhibit the catalytic activity or 
expression of DXPS in the mevalonate-independent IPP 
biosynthetic pathway. According to a further aspect 
of the invention, therefore, there is also provided a 
method of identifying a compound which modulates 
isoprenoid production or expression, which method 
comprises contacting said compound to be tested with a 
molecule from the mevalonate independent IPP 
biosynthetic pathway, which molecule undergoes a 
reaction catalysed by DXPS in the presence of an 
appropriate reactant, in the presence of DXPS, and 
monitoring the level of product produced when compared 
to the same reaction in the absence of the compound to 
be tested. Preferably, the molecules which are 
reacted are pyruvate and glyceraldehyde-3-phosphate, 
which undergo a condensation reaction in the 
presence of DXPS to yield 1-deoxy-D-xylulose-5-phosphate 
(DXP) as illustrated in Figure 1. 
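
The comparison at the heart of this screening method (product formed with the test compound versus without it) can be sketched as follows. This is a minimal illustration only: the function names, the 50% threshold and the numerical values are hypothetical and do not come from the specification.

```python
# Minimal sketch: compare DXP formed with and without a test compound.
# Names, threshold and values are illustrative assumptions.
from statistics import mean


def percent_inhibition(product_with: float, product_without: float) -> float:
    """Percentage reduction in product relative to the no-compound control."""
    if product_without <= 0:
        raise ValueError("control reaction must yield a positive amount")
    return 100.0 * (1.0 - product_with / product_without)


def is_candidate_inhibitor(with_compound, without_compound, threshold=50.0):
    """Flag a compound whose mean inhibition meets an assumed threshold."""
    inhibition = percent_inhibition(mean(with_compound), mean(without_compound))
    return inhibition >= threshold


if __name__ == "__main__":
    # Hypothetical DXP readings (arbitrary units) from triplicate reactions.
    control = [100.0, 100.0, 100.0]
    treated = [25.0, 25.0, 25.0]
    print(percent_inhibition(mean(treated), mean(control)))
    print(is_candidate_inhibitor(treated, control))
```

A negative result from `percent_inhibition` would indicate a compound that increases product formation, which is also a form of modulation within the meaning of the method.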

Any compounds identified as preventing expression or 
activity of the DXPS enzyme according to the invention 
may advantageously be particularly useful as selective 
toxicity agents to destroy, for example, bacterial or 
plant cells which possess the mevalonate independent 
IPP biosynthetic pathway. These compounds therefore 
can be particularly useful as medicaments or 
herbicides, or alternatively in the preparation of a 
medicament for treating bacterial associated diseases. 

A further aspect of the invention therefore also 
comprises a pharmaceutical composition comprising a 
compound identified as an inhibitor of expression or 
activity of DXPS or a functional equivalent or 
bioprecursor thereof, together with a pharmaceutically 
acceptable carrier, diluent or excipient thereof. 
Also provided by the invention is a herbicidal 
composition comprising said compound identified as an 
inhibitor of expression or activity of DXPS function. 
An even further aspect of the invention comprises a 
transgenic cell, tissue or organism having a 
mevalonate independent IPP biosynthetic pathway, which 
comprises a transgene capable of expressing at least 
one additional DXPS molecule according to the 
invention. The transgenic cell, tissue or organism 
may also comprise a transgene having one or more 
nucleic acid sequences capable of producing a desired 
isoprenoid. Preferably, the transgenic organism is 
a plant, and even more preferably a tomato plant. 
The term "transgene capable of expression" as used 
herein means a suitable nucleic acid sequence(s) which 
leads to expression of DXPS or proteins having the 
same function and/or activity and/or encoding proteins 
capable of producing a desired isoprenoid. The 
transgene may include, for example, isolated genomic 
nucleic acid or synthetic nucleic acid, including DNA 
integrated into the genome. Preferably, the transgene 
comprises the nucleic acid sequence(s) encoding the 
DXPS enzyme or said isoprenoid as described herein, or 
a functional fragment of said nucleic acid. A 
functional fragment of said nucleic acid should be 
taken to mean a fragment of the gene comprising said 
nucleic acid(s) coding for the DXPS enzyme or said 
isoprenoid or a functional equivalent, derivative or a 
non-functional derivative such as a dominant negative 
mutant, or bioprecursor thereof. For example, it 
would be readily apparent to persons skilled in the 
art that nucleotide substitutions or deletions may be 
made using routine techniques, which do not affect the 
protein sequence and subsequent functioning of the 
DXPS enzyme and/or isoprenoid producing proteins 
encoded by said nucleic acid(s). 

The DXPS enzyme expressed or the isoprenoid produced 
by said transgenic cell, tissue or organism or a 
functional equivalent or bioprecursor of said protein 
also forms part of the present invention. 

The recombinant DNA molecules or vectors of the 
invention can be introduced into a plant cell in a 
number of recognised ways in the art and it will be 
appreciated that the choice of method used might 
depend on the type of plant, i.e. monocot or dicot, 
targeted for transformation. Suitable methods of 
transforming plant cells include microinjection 
(Crossway et al. (1986) BioTechniques 4:320-334), 
electroporation (Riggs et al. (1986) Proc. Natl. Acad. 
Sci. USA 83:5602-5606), Agrobacterium mediated 
transformation (Hinchee et al. (1988) Biotechnology 
6:915-921) and ballistic particle acceleration (see, 
for example, Sanford et al., U.S. Patent 4,945,050; 
and McCabe et al. (1988) Biotechnology 6:923-926). 

Alternatively, in the case of an organism, such as a 
plant, a plastid can be transformed directly. Stable 
transformation of chloroplasts has been reported in 
higher plants, see, for example, Svab et al. (1990) 
Proc. Natl. Acad. Sci. USA 87:8526-8530; Svab & Maliga 
(1993) Proc. Natl. Acad. Sci. USA 90:913-917; Staub & 
Maliga (1993) EMBO J. 12:601-606. The method relies 
on particle gun delivery of DNA containing a 
selectable marker and targeting of the DNA to the 
plastid genome through homologous recombination. In 
such methods, plastid gene expression can be 
accomplished by use of a plastid gene promoter or by 
trans-activation of a silent plastid-borne transgene 
positioned for expression from a selective promoter 
sequence such as that recognised by T7 RNA polymerase. 
The silent plastid gene is activated by expression of 
the specific RNA polymerase from a nuclear expression 
construct and targeting of the polymerase to the 
plastid by use of a transit peptide. Tissue-specific 
expression may be obtained in such a method by use of 
a nuclear-encoded and plastid-directed specific RNA 
polymerase expressed from a suitable plant tissue 
specific promoter. Such a system has been reported in 
McBride et al. (1994) Proc. Natl. Acad. Sci. USA 
91:7301-7305. 

The cells which have been transformed may be grown 
into plants in accordance with conventional methods 
known in the art. See, for example, McCormick et al. 
Plant Cell Reports (1986), 5:81-84. These plants may 
then be grown, and either pollinated with the same 
transformed strain or different strains, and the 
resulting hybrid having the desired phenotypic 
characteristic identified. Two or more generations 
may be grown to ensure that the subject phenotypic 
characteristic is stably maintained and inherited and 
then seeds harvested to ensure the desired phenotype 
or other property has been achieved. 

A host cell of any plant variety may be employed. 
Plant species which provide seeds of interest are 
particularly useful. For the most part, plants will 
be chosen where the seed is produced in high amounts, 
a seed-specific product of interest is involved, or 
the seed or a seed part is edible. Seeds of interest 
include the oil seeds, such as oilseed Brassica seeds, 
cotton seeds, soybean, safflower, sunflower, coconut, 
palm, and the like; grain seeds, e.g. wheat, barley, 
oats, amaranth, flax, rye, triticale, rice, corn, etc.; 
other edible seeds or seeds with edible parts 
including pumpkin, squash, sesame, poppy, grape, mung 
beans, peanut, peas, beans, radish, alfalfa, cocoa, 
coffee, tree nuts such as walnuts, almonds, pecans, 
chick-peas etc. 

The invention may be more clearly understood from the 
following exemplary embodiment described with 
reference to the accompanying drawings wherein: 

Figure 1: is an illustration of the 
          mevalonate-dependent (A) and 
          independent (B) pathways for IPP 
          biosynthesis. The proposed reaction for 
          the biosynthesis of 
          1-deoxy-D-xylulose-5-phosphate from 
          pyruvate and 
          glyceraldehyde-3-phosphate, catalysed 
          by DXPS, is shown inside the box. 

Figure 2: is an illustration of the structure of 
          plasmids pBSDXPS and pSYDXPS. 

Figure 3: is an illustration of an amino acid 
          sequence alignment of DXP synthases 
          used in the present invention, 
          Synechocystis sp. 6803 (S.s) (GenBank 
          D90903), B. subtilis (B.s) (GenBank 
          D84432) and E. coli (E.c) (GenBank 
          AF035440). The consensus line (consen) 
          shows residues conserved in all three 
          sequences (upper case letters) or 
          residues which are identical in two 
          sequences and replaced by an equivalent 
          amino acid in the third sequence (+). 
          The conserved histidine domain 
          putatively involved in proton transfer 
          is overlined and numbered 1. The 
          second overlined domain (2) denotes 
          the consensus thiamin pyrophosphate 
          (TPP)-binding motif. 

Figure 4: is a graphic representation of lycopene 
          accumulation in recombinant E. coli 
          cultures expressing vector only (□), 
          B. subtilis DXPS (•) and Synechocystis 
          sp. 6803 DXPS (a). (Data are means ± 
          S.E.M. from three independent 
          determinations.) 

Figure 5: is an illustration of lycopene (open 
          columns) and UQ-8 (shaded columns) 
          content of E. coli control cultures 
          (vector only) or expressing exogenous 
          B. subtilis dxps (B. subtilis), 
          Synechocystis sp. 6803 dxps (sp. 6803) 
          or A. thaliana hmgr1 (HMGR1) genes. 
          (Data are means ± S.E.M. from three 
          independent determinations.) 

Figure 6: is a diagrammatic illustration of 
          vector pVB6_TSEC_LML. 

Figure 7: is a diagrammatic representation of 
          plasmid pVB6_35S_TSEC-LML. 

Figure 8: is an illustration of the amino acid 
          sequence of E. coli DXPS. 

Figure 9: is an illustration of the transit 
          peptide used in tomato plants. 
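
The rule stated for the consensus line of Figure 3 (an upper case letter where all three sequences agree, and "+" where two agree and the third carries an equivalent residue) can be sketched as below. This is a sketch under stated assumptions: the specification does not define which amino acids count as equivalent, so the groups used here are hypothetical, and the third sequence in the usage example is invented for illustration rather than taken from the E. coli sequence.

```python
# Minimal sketch of the Figure 3 consensus rule for three aligned sequences
# (one-letter codes, '-' for gaps). The equivalence groups are an assumption;
# the specification does not say which substitutions it treats as equivalent.
from collections import Counter

EQUIVALENT_GROUPS = ["AVLIM", "FWY", "KRH", "DE", "NQ", "ST"]


def equivalent(a: str, b: str) -> bool:
    """True if both residues fall in the same assumed equivalence group."""
    return any(a in group and b in group for group in EQUIVALENT_GROUPS)


def consensus_line(s1: str, s2: str, s3: str) -> str:
    """Build the consensus string for three equal-length aligned sequences."""
    if not (len(s1) == len(s2) == len(s3)):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(s1.upper(), s2.upper(), s3.upper()):
        residue, count = Counter(column).most_common(1)[0]
        if "-" in column:
            out.append(" ")          # gap: no consensus symbol
        elif count == 3:
            out.append(residue)      # conserved in all three sequences
        elif count == 2:
            odd = next(r for r in column if r != residue)
            out.append("+" if equivalent(residue, odd) else " ")
        else:
            out.append(" ")          # all three differ
    return "".join(out)


if __name__ == "__main__":
    # First two strings follow residues 49-56 of SEQ ID NO: 1 and NO: 2;
    # the third is a hypothetical sequence used only to exercise the rule.
    print(repr(consensus_line("VELTVALY", "VELTVALH", "VELTIALH")))
```

Changing `EQUIVALENT_GROUPS` changes which columns receive "+", so the output should not be read as reproducing the actual figure.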



EXAMPLE 1 
Materials and methods 

Bacterial strains, plasmids, and culture conditions. 

E. coli strain XL1-Blue (Stratagene) was used for gene 
cloning and expression of plasmids. E. coli was grown 
in Luria Broth medium [18] at 37°C on a rotary shaker 
at 250 rpm (unless otherwise stated). Ampicillin (100 
µg/ml), chloramphenicol (50 µg/ml) and 1.0 mM 
isopropyl-β-D-thiogalactoside (IPTG) (all purchased 
from Sigma) were added as required. Plasmid 
pBluescript (Stratagene) was used as a vector for both 
cloning and expression studies. Synechocystis sp. PCC 
6803 was obtained from the Institut Pasteur (Paris) 
and grown in BG11 medium [19] supplemented with 0.5% 
glucose at 30°C and 2,000 lux. Bacillus subtilis 
strain PY79 DNA was a kind gift from P. Wakeley (Royal 
Holloway, University of London). The construction of 
plasmid pACCRT-EIB, which expresses the E. uredovora 
crtE, crtB and crtI genes necessary for lycopene 
biosynthesis in E. coli cells into which it has been 
introduced, has been described previously [20]. The 
plasmid used for the expression of HMGR1 cloned into 
pBluescript (pHMGR1) has also been described elsewhere 
[21]. 

Recombinant DNA techniques 

All recombinant DNA techniques were performed by 
standard methods [22] or according to suppliers' 
instructions. Genomic DNA was extracted from all 
organisms using the Qiagen Genomic-tip 20/G kit. 

Cloning of dxps genes 

Based on the nucleotide sequence of ORF sll1945 from
the genome database for Synechocystis sp. PCC 6803
[23], primers were designed to clone the putative
dxps gene by polymerase chain reaction (PCR). The
forward primer 5'-GTCCCAATCCACCATGCACATCAG-3' overlaps
the beginning of the coding sequence. The reverse
primer 5'-CCCTCGACAAATGCAAAATGTATC-3' lies outside the
stop codon of the gene. A PCR (25 cycles) using Pfu
DNA polymerase (Stratagene) produced a DNA fragment of
the expected size (~1.9 kb). Subsequent sequencing of
the fragment confirmed the product to be the ORF
sll1945. The B. subtilis dxps gene was also cloned by
PCR using primers designed to amplify the gene
encoding the product YqiE, identified in the Bacillus
subtilis genome database [24]. The forward primer
5'-GATCCGCTATGGATCTTTTATC-3' contains a modified base
substitution at the predicted start codon (underlined)
for improved expression in E. coli. The reverse
primer 5'-ATCTAATCGTTCTTTCTTTGAC-3' lies outside the
stop codon of the dxps gene. After PCR (25 cycles) a
DNA product of the expected size (~1.9 kb) was
obtained, and when sequenced proved to be identical
to the gene encoding the product YqiE. The PCR
products from both reactions were treated with Taq DNA
polymerase (GibcoBRL) at 72°C for 10 min to synthesise
blunt ended fragments. The fragments were then cloned
into the EcoRV site of the pBluescript vector
(Stratagene) using T4 DNA ligase (Fermentas) (Fig. 2).
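
As a quick sanity check on the four primers quoted above, the following minimal Python sketch reports primer length and GC content and provides a reverse-complement helper. The sequences are copied from the text; the dictionary keys and the helper names (gc_content, reverse_complement) are illustrative only and are not part of the described method.

```python
# Primer sequences as quoted in the text (written 5' -> 3').
# The dictionary keys are illustrative labels, not names used in the source.
PRIMERS = {
    "Synechocystis_fwd": "GTCCCAATCCACCATGCACATCAG",
    "Synechocystis_rev": "CCCTCGACAAATGCAAAATGTATC",
    "B_subtilis_fwd": "GATCCGCTATGGATCTTTTATC",
    "B_subtilis_rev": "ATCTAATCGTTCTTTCTTTGAC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC = {gc_content(seq):.0%}")
```

Both forward primers contain an ATG triplet, which is consistent with the statement that they span the start of the coding sequence.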

In vitro DXP synthase assay 

E. coli XL1-Blue cells, transformed with the
appropriate plasmid, were grown at 37°C in Luria
Broth medium with appropriate antibiotics to an OD620nm
of 0.6, and induced by the addition of 1.0 mM IPTG at
28°C for two hours. Bacteria were harvested by
centrifugation (6,000 g for 10 min) and washed in
buffer A (100 mM Tris-HCl (pH 7.5), 1 mM
dithiothreitol, 0.3 M sucrose). Cells were resuspended
to their original volume in buffer B (100 mM Tris (pH
8.0), 1 mM dithiothreitol, 0.1 mM
phenylmethanesulphonyl fluoride, pepstatin,
1 µg/ml leupeptin, 1 mg/ml lysozyme). The cells were
then incubated at 30°C for 15 min with gentle
agitation, and then disrupted by brief sonication at
4°C. The supernatant was recovered and the protein
concentration determined using the Bradford assay
[25].

An aliquot of the supernatant (100 µl) was transferred
to an Eppendorf tube along with 100 µl of assay buffer
containing 100 mM Tris (pH 8.0), 3 mM ATP, 3 mM
Mn2+, 3 mM Mg2+, 1 mM KF, 1 mM thiamine diphosphate,
0.1% Tween 60, 0.6 mM
DL-glyceraldehyde-3-phosphate, 30 µM [2-14C]pyruvate
(0.5 µCi). The mixture was incubated for 2 hours at
30°C with gentle agitation. The reaction was stopped
by heating the mixture at 80°C for 3 min. After
centrifugation at 13,000 g for 5 min, the supernatant
was transferred to a clean tube and evaporated to
dryness. The residue was resuspended in methanol (50 µl)
and loaded onto a TLC plate (silica gel 60).
Chromatograms were developed in n-propyl alcohol/ethyl
acetate/H2O (6:1:3 v/v/v).

Enzyme assays were performed with extracts of induced
cells expressing either Synechocystis sp. PCC 6803 or
B. subtilis DXPS, as opposed to control assays in
which cells contained only the pBluescript vector
without insert. TLC analysis of assays expressing one
of the dxps clones exhibited a major band (Rf 0.14)
assumed to be DXP, which was not observed in the
controls. Quantification of 14C-labelled DXP was
achieved by isolation of the reaction product on TLC.
The DXP band was scraped off the plate, eluted from
the silica using methanol and quantified by
liquid-scintillation counting. Enzymatic
dephosphorylation of the assay products resulted in
the formation of 1-deoxy-D-xylulose (DX), when
analysed on TLC (Rf 0.50). When non-radioactive
pyruvate was used in the assay, the DXP (Rf 0.12,
stained purple) and DX (Rf 0.47, stained blue) were
identified by staining with p-anisaldehyde/sulphuric
acid (3:1). The DXP co-chromatographed with
authentic, chemically synthesised DXP, which also
stained purple. The reaction substrates pyruvate (Rf
0.36, stained yellow), DL-glyceraldehyde-3-phosphate
(Rf 0.15, stained orange) and D-glyceraldehyde (Rf 0.74,
stained orange) were also observable using this TLC
system. In reactions where the assay products were
dephosphorylated, no DXP was observed on TLC, only DX.
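
The Rf values quoted above for the stained plates can be collected into a small lookup. The sketch below assumes the usual definition of Rf (distance migrated by the spot divided by distance migrated by the solvent front); the function names and the matching tolerance of 0.02 are illustrative assumptions, not values from the text.

```python
# Reference Rf values for the p-anisaldehyde-stained plates, as quoted in the text.
REFERENCE_RF = {
    "DXP": 0.12,
    "DX": 0.47,
    "pyruvate": 0.36,
    "DL-glyceraldehyde-3-phosphate": 0.15,
    "D-glyceraldehyde": 0.74,
}


def rf_value(spot_distance_cm: float, front_distance_cm: float) -> float:
    """Rf = distance travelled by the spot / distance travelled by the solvent front."""
    if front_distance_cm <= 0 or not 0 <= spot_distance_cm <= front_distance_cm:
        raise ValueError("distances must satisfy 0 <= spot <= front and front > 0")
    return spot_distance_cm / front_distance_cm


def closest_reference(rf: float, tolerance: float = 0.02):
    """Return the reference compound nearest to rf, or None if none is within tolerance.

    The tolerance is a hypothetical choice made for this sketch.
    """
    name, ref = min(REFERENCE_RF.items(), key=lambda item: abs(item[1] - rf))
    return name if abs(ref - rf) <= tolerance else None
```

Because DXP (0.12) and DL-glyceraldehyde-3-phosphate (0.15) run close together, Rf alone is a weak discriminator for that pair; the staining colours reported in the text (purple versus orange) separate them.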

Quantification of lycopene and ubiquinone UQ-8 in E.
coli

Bacterial growth was determined from the OD620nm. Dry
cell weight was calculated from known volumes of
culture harvested by centrifugation at 13,000 g for 5
min, washed once with water and recentrifuged. The
cells were lyophilised overnight and the weight of the
dried cell pellet determined. The lycopene content of
the cells was determined by harvesting aliquots of E.
coli cells by centrifugation at 13,000 g for 5 min and
washing once in water followed by recentrifuging. The
cells were resuspended in acetone (200 µl) and
incubated at 68°C for 5 min in the dark. The samples
were centrifuged again at 13,000 g for 10 min and the
acetone supernatant containing the lycopene was placed
in a clean tube. The extract was evaporated to
dryness under a stream of N2 and stored at -20°C in
the dark. The lycopene content of the extracts was
determined by visible light absorption spectra using a
Beckman DU Series 7000 diode array spectrometer.
Spectra were recorded in acetone using an A1% 1cm of 3450
[26].
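
An A1% 1cm of 3450 means that a 1% (w/v, i.e. 1 g per 100 ml) solution would give an absorbance of 3450 in a 1 cm cuvette. The following minimal sketch converts a measured absorbance into a lycopene concentration on that basis. The function names, the dilution parameter and the example figures are assumptions for illustration; the text does not state the measuring wavelength or any dilution.

```python
# Specific extinction coefficient for lycopene in acetone, as quoted in the text.
A1PCT_1CM_LYCOPENE_ACETONE = 3450.0


def lycopene_ug_per_ml(absorbance: float, path_cm: float = 1.0,
                       dilution_factor: float = 1.0) -> float:
    """Convert absorbance to lycopene concentration in ug/ml.

    1% (w/v) = 1 g/100 ml = 10,000 ug/ml, so
    concentration = A / (A1% 1cm * path) * 10,000 * dilution.
    """
    percent_w_v = absorbance / (A1PCT_1CM_LYCOPENE_ACETONE * path_cm)
    return percent_w_v * 10_000.0 * dilution_factor


def lycopene_ug_per_mg_dry_weight(absorbance: float, extract_volume_ml: float,
                                  dry_weight_mg: float) -> float:
    """Express the lycopene in an extract per mg of dry cell weight."""
    return lycopene_ug_per_ml(absorbance) * extract_volume_ml / dry_weight_mg
```

For example, an absorbance of 0.345 in a 1 cm cuvette corresponds to about 1 µg/ml under these assumptions.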

UQ-8 was extracted from cells based on the methods of
Yoshida et al. [27]. Cells were collected by
centrifugation, washed once with water and then
lyophilised overnight. The dried pellet was extracted
in n-propanol (3 ml) and n-hexane (5 ml) containing
15 µg of UQ-10 as an internal standard, by disruption
of the cells using a pestle and mortar. The solvent
phase and that obtained by a second extraction of
the aqueous phase with n-hexane (3 ml) were combined and
evaporated to dryness under N2. The residue was
resuspended in ethanol and analysed by reversed phase
HPLC as described previously [28]. Peaks were
identified by comparing their elution profiles with
standards for UQ-7, UQ-9 and UQ-10. A standard of
UQ-8 was not available, and the UQ-8 peak was
identified by its elution profile relative to those of
the other standards [29].

Cloning of the dxps genes 

The cloning of dxps and the characterisation of the
gene product, DXPS, from E. coli has recently been
reported by two research groups [11,12]. The gene
product was shown to exhibit DXP synthase activity,
which is considered to be the first reaction of the
mevalonate-independent pathway for IPP biosynthesis
(Fig. 1) [5]. Based on the E. coli dxps nucleotide
sequence, homologs of the gene were identified in the
eubacterial genomes of B. subtilis and Synechocystis
sp. PCC 6803. The open reading frame sll1945 in the
Synechocystis sp. 6803 genome was cloned by PCR,
ligated into the vector pBluescript, and designated
pSYDXPS (Fig. 2). The gene extends over 1920 bp and
contains an open reading frame encoding a polypeptide
of 640 amino acids, with a predicted molecular mass of
69 kDa. The dxps homolog in the B. subtilis genome was
identified as the ORF encoding the product YqiE. It
was cloned by PCR, and introduced into pBluescript to
generate plasmid pBSDXPS (Fig. 2). The gene extends
over 1899 bp and encodes a polypeptide of 633 amino
acids with a predicted molecular mass of 70 kDa.
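
The reported gene and polypeptide lengths are consistent with three nucleotides per residue (1920/3 = 640 and 1899/3 = 633). The sketch below checks this arithmetic and gives a rough mass estimate. The average residue mass of 110 Da is a common rule of thumb introduced here as an assumption; it is not taken from the text, which does not say how the predicted masses were calculated or whether the stop codon is counted in the quoted lengths.

```python
# Rule-of-thumb average mass of an amino acid residue in Da (assumption, not from the text).
AVERAGE_RESIDUE_MASS_DA = 110.0


def codons_in_orf(length_bp: int) -> int:
    """Return the number of codons in an ORF whose length is a multiple of three."""
    if length_bp <= 0 or length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return length_bp // 3


def approx_mass_kda(n_residues: int) -> float:
    """Very rough polypeptide mass estimate in kDa from the residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    for label, bp in (("Synechocystis sll1945", 1920), ("B. subtilis yqiE", 1899)):
        n = codons_in_orf(bp)
        print(f"{label}: {bp} bp -> {n} codons, roughly {approx_mass_kda(n):.0f} kDa")
```

The estimates land within a few kDa of the 69 kDa and 70 kDa values reported; an exact figure would require summing the masses of the actual residues.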

The amino acid sequences of the DXPS proteins of
Synechocystis sp. 6803 and B. subtilis exhibited
significant similarity to each other over their entire
length (47% identity) and to the E. coli DXPS (B.
subtilis, 44% identity; Synechocystis sp. 6803,
46% identity) (Fig. 3). All three polypeptides
share two conserved features: a domain thought to be
involved in thiamin binding [30] and a histidine
residue postulated to participate in proton transfer
[31], both of which are detailed in Fig. 3. The
existence of a thiamin-binding domain in each of the
polypeptides explains the cofactor requirement of
thiamin for DXPS activity [12]. The high degree of
polypeptide sequence identity, particularly the
distribution of conserved domains, in all three
indicates that they all encode DXPS or a closely
related gene product.
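
Percent identity figures such as those above are computed from a pairwise alignment. The text does not say which alignment program or which denominator was used, so the following is only a minimal sketch under one common convention: identical positions divided by the number of alignment columns, with gaps written as "-". The function name is illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are ignored; a residue
    opposite a gap counts as a mismatch. This denominator is an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)
```

Other conventions (for example dividing by the length of the shorter sequence) give slightly different percentages, so values from different tools are not always directly comparable.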

Quantification of lycopene and UQ-8 in E. coli
transformants

Cells of E. coli transformed with pACCRT-EIB [20] are
pigmented pink due to the accumulation of lycopene.
E. coli cells engineered to produce lycopene were
transformed with either pBSDXPS, pSYDXPS, pHMGR, or
pBluescript to act as a control, to monitor the
effect on lycopene biosynthesis when exogenous DXPS
was expressed in the cells. The E. coli were grown in
50 ml cultures at 30°C with induction by IPTG for 48
hours, by which time they had reached the stationary
phase of growth. Figure 4 shows the accumulation of
lycopene in the cultures during the 48 hour culture
period. The graph demonstrates that the E.
coli cultures expressing exogenous dxps accumulated
lycopene at a much greater rate than the control
culture. The final lycopene content of the
recombinant dxps strains was approximately double that
of the control (Fig. 5). A similar increase was also
obtained in E. coli cells engineered to produce the
colourless carotenoid phytoene (data not shown).
Alterations in the endogenous levels of isoprenoids
were determined by measuring the ubiquinone content of
the cells. In E. coli, the major quinones
encountered are ubiquinone (UQ-8) and menaquinone
(MK-8) [32]. Ubiquinone is a major component of the
aerobic respiratory chain. It is estimated that there
are approximately 50 molecules of ubiquinone for each
of the oxidation complexes in E. coli [33]. By
measuring an end product which is produced in
relatively large quantities, it was conjectured that
alterations in the rates of biosynthesis could be
readily detected. The UQ-8 content of the recombinant
dxps strains was 1.5 times greater than the controls
(Fig. 5). Lycopene and UQ-8 levels were measured in
E. coli transformed with hmgr1 from A. thaliana, to
monitor whether this caused any alterations in the
isoprenoid content of the cells. Expression of the A.
thaliana hmgr1 cDNA had no effect on the levels of
lycopene or UQ-8 in the cells (Fig. 5).

The results show that increased expression of DXPS
leads to increased lycopene and UQ-8 levels in the
recombinant E. coli cells. This indicates that
increasing the rate of DXP synthesis, the initial
reaction in the mevalonate-independent pathway for IPP
biosynthesis, elevates isoprenoid production. In
contrast, expression of hmgr1 had no effect on
isoprenoid biosynthesis, suggesting that mevalonate
dependent IPP biosynthesis has little or no role in
IPP synthesis in E. coli. Similarity searches of the
E. coli genome database for proteins of the
mevalonate-dependent IPP biosynthesis pathway failed
to identify any possible homologs in the genome,
suggesting that this pathway is probably absent in
this organism.

In vitro enzyme activity

The increased levels of carotenoids and UQ-8 in E.
coli expressing exogenous DXPS were hypothesised to be
due to increased DXPS enzymatic activity in the cells.
This was confirmed by preparing cell homogenates from
recombinant E. coli strains after induction with IPTG.
Reaction products were measured over a two hour
period, separated by TLC and quantified by
liquid-scintillation counting. The major product
obtained from the reaction co-chromatographed with
chemically-synthesised DXP. This confirms DXP as the
major reaction product in the assay. The putative
DXPS function of the B. subtilis ORF encoding the product
YqiE and the Synechocystis sp. 6803 ORF sll1945 has been
established by these results. Table 1 shows the
specific activity of DXPS in the recombinant E. coli
strains. The results show that DXPS activity was
increased in E. coli expressing exogenous dxps genes.
This increase was greatest in homogenates containing
the B. subtilis DXPS, where a 2.0 fold increase was
observed compared to the controls. Homogenates
containing the Synechocystis sp. 6803 DXPS exhibited a
1.8 fold increase compared to control reactions.
Therefore, increased DXPS activity in E. coli
appears to be responsible for the increased levels of
carotenoids and UQ-8 observed in the transgenic
strains. The relative increases in carotenoid levels
between E. coli cultures expressing plasmids pSYDXPS
and pBSDXPS closely resemble the increases observed in
the in vitro studies. This suggests that there is a
direct relationship between DXPS activity and the
carotenoid content of the cells. This is not the case
for UQ-8, where increases in the levels of UQ-8 are
more restricted, which could be due to a
rate-limiting reaction later in the UQ-8 biosynthesis
pathway [34]. The results support the hypothesis that
increased DXPS activity in E. coli results in
increased levels of carotenoids and UQ-8. These data
suggest that isoprenoid levels in E. coli can be
increased by enhancing DXPS activity.

Transformation Protocols in Tomato Plants

Triparental mating

Liquid LB medium (5 ml) containing rifampicin
(100 µg/ml) was inoculated with a single Agrobacterium
tumefaciens colony picked from an LB/rif plate. It
was then incubated in a 27°C shaking incubator
(225-250 rpm) for 48 hours in the dark. Single colonies of
the helper strain E. coli HB101/pRK2013 (kanamycin
resistant) and the donor were also picked and grown
overnight at 37°C in LB liquid medium with appropriate
antibiotics. Following the incubation period each
bacterial culture was centrifuged at 10,000 rpm for 2
minutes. The supernatants were discarded and the
pellets resuspended in LB liquid medium. Aliquots of
each strain (100 µl each) were then mixed and spread
with a sterile spreader onto an LB plate with no
selection. The plate was inverted and incubated
overnight at 27°C in the dark. A loopful of the
overnight mating mix was then streaked onto an LB plate
containing selective antibiotics (rifampicin, 100 µg/ml
and kanamycin, 50 µg/ml). The plate was inverted and
incubated for 48-72 hours at 27°C in the dark. Single
colonies could then be selected for use in
transformation of tomato explants.

Seed sterilisation

Tomato seeds of the variety Ailsa Craig were placed into a
sterile 50 ml Falcon tube. The seeds were washed with
70% ethanol for 30 seconds and the ethanol removed.
1% Virkon was then added and the tube incubated with
shaking at 27°C for 20-30 minutes. The seeds were then
washed with sterile dH2O (~500 ml) through a sterile sieve.

Seed sowing 

MS3S medium (125 ml) was poured per sterile double
Magenta pot (Sigma) and allowed to set.

Five sterile seeds were then sown in each pot and
incubated for 5 weeks in a control temperature room
(27°C) under 5 cool white light tubes with a 16 hour
photoperiod and 70% relative humidity.

Explant preparation 

Plates were prepared for explant preparation by the
addition of MS3C5ZR medium to petri dishes (25 plates
per litre of medium). A sterile 8.5 cm filter disc was
then placed onto each plate. Plates were wrapped in
cling film and stored at room temperature. Explants
were taken under aseptic conditions from 5 week old
seedlings. Sections of 1-1.5 cm from above the cotyledons
were cut and all leaves, roots and leaf nodes were
removed. The explants were placed on a filter disc on
pre-incubation medium (10 per plate), as prepared in
step 1. The plates were then sealed and stored at
26°C with low light intensity.

A. tumefaciens culture preparation

Several A. tumefaciens colonies from triparental
mating containing either pVB6_TSEC-LML or
pVB6_35S_TSEC-LML were picked and used to inoculate LB
liquid medium (10 ml) containing kanamycin (50 µg/ml).
The culture was incubated in a shaking incubator
(225-250 rpm) for 24 hours at 27°C. The overnight culture
(10 ml) was added to LB liquid medium (50 ml)
containing kanamycin (50 µg/ml). This second culture
was then incubated for 24 hours at 27°C in a shaking
incubator (225-250 rpm).

The A. tumefaciens culture (40 ml) was then briefly
centrifuged in a bench-top centrifuge (up to 3,000 rpm)
to remove clumps of growth. The supernatant was then
carefully collected into a sterile 50 ml Falcon tube.
The supernatant was spun at 3,000 rpm in a bench-top
centrifuge for 10 minutes and the supernatant
discarded. The pellet was resuspended in MS3S (30 ml)
by vortexing. The culture was diluted to 1/10th with
MS3S and the optical density (OD) at 550 nm measured
with MS3S as a blank. The OD was adjusted to 0.1 with
MS3S. 20-25 ml of culture was prepared for every 50
explants transformed.
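
The OD adjustment described above is a simple dilution calculation. The sketch below assumes that OD scales linearly with cell density over the range used and that the target OD of 0.1 refers to the final suspension applied to the explants; neither point is stated explicitly in the text. The function names and example readings are illustrative.

```python
def undiluted_od(measured_od: float, dilution_factor: float = 10.0) -> float:
    """Estimate the OD of the resuspended culture from a reading on a 1/10 dilution."""
    return measured_od * dilution_factor


def volumes_for_target_od(stock_od: float, final_volume_ml: float,
                          target_od: float = 0.1):
    """Return (ml of culture, ml of MS3S) giving final_volume_ml at target_od.

    Uses C1 * V1 = C2 * V2, which assumes OD is proportional to cell density.
    """
    if stock_od < target_od:
        raise ValueError("stock culture is already below the target OD")
    culture_ml = final_volume_ml * target_od / stock_od
    return culture_ml, final_volume_ml - culture_ml


if __name__ == "__main__":
    stock = undiluted_od(0.25)  # hypothetical reading on the 1/10 dilution
    culture, medium = volumes_for_target_od(stock, 25.0)
    print(f"Mix {culture:.1f} ml culture with {medium:.1f} ml MS3S")
```

With a hypothetical reading of 0.25 on the 1/10 dilution, about 1 ml of culture made up to 25 ml with MS3S would give the working suspension for 50 explants.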

Transformation of explants 

50 explants were prepared as above (5 plates) and were
transferred into petri dishes and 25 ml of A.
tumefaciens solution per petri dish poured over them.
They were then incubated at room temperature for 10
minutes before being transferred to petri dishes
containing a double layer of sterile filter paper.
The explants were then transferred to plates
containing MS3SC5ZR medium (10 per plate). The plates
were sealed and then incubated in a control
temperature room (27°C) for 48 hours.

Selection

The explants were transferred to selection media
MS3C5RCK (10 explants per plate) and sealed before
returning to the control temperature room for 2 weeks.

Subculture of explants

Following selection, explants were subcultured every 2
weeks on MS3C5ZRCK medium. When shoots developed they
were carefully excised and transferred to Phytatrays
(Sigma) containing MS3C5CK. DNA samples for PCR
analysis were collected when shoots were sufficiently
developed. Once the shoots had rooted they were
transferred to the glasshouse, where initially they
were placed in vermiculite with 1 g/L Osmocote slow
release fertiliser and then, once roots were
established, they were transferred to soil.

Constructs for transformation

pVB6_35S-TSEC-LML and pVB6-TSEC-LML are shown in
diagrammatic form in Figures 7 and 6 respectively.

Analysis of transformants

1. All transformants were tested for the transgene,
using PCR with E. coli Dxps-specific primers:

Forward: 5'-GCG CCG CTA TTT ACT CGA-3'

Reverse: 5'-TTT CTC TGG CGT GCC GCC-3'

2. Those that were PCR positive were tested by
Southern blot analysis for the number of inserts,
using the nptII probe.

3. Single insert transformants were then analysed
for Dxps expression using RT-PCR, and the primers
described in 1, above.

4. Expressing lines were tested for DXPS protein
levels using Western blots with an antibody
specific for the E. coli protein. A band of ca. 69
kDa was found, showing both expression of the
transgene and cleavage of the transit peptide
from the mature protein.

5. Seed was collected from all single insert lines
for sowing.

6. T1 progeny were cultivated for pigment analysis
and inheritance of phenotype.



Isoprenoids constitute a large group of compounds, many
of which are of high economic value. The condensation
of (hydroxy)thiamin, derived from the decarboxylation
of pyruvate, with glyceraldehyde-3-phosphate to yield
1-deoxy-D-xylulose-5-phosphate is considered to be
the first reaction in the mevalonate-independent
pathway for IPP and ultimately isoprenoid
biosynthesis. The data presented show that increasing
the rate of DXP synthesis in E. coli results in
increased isoprenoid biosynthesis. This finding can
therefore be utilised to optimise the industrial
production of isoprenoids from bacteria. The
manipulation of enzyme activities important in the
biosynthesis of specific isoprenoids in concert with
DXPS may be employed to bioengineer the production of
specific, high value isoprenoids in E. coli or another
suitable cell or organism, such as in plants, where
increased isoprenoid production could be used for
improving crop flavour, fragrance and colour.
Alternatively, crops could be engineered to produce
increased concentrations of isoprenoids with
pharmaceutical and/or nutritional properties.

TABLE 1. DXP synthase activity in E. coli homogenates

                 Specific activity        Fold increase
                 (nmol/min/mg protein)    in activity

Control          5.8 ± 0.07               1.0
B. subtilis      11.5 ± 0.58              2.0
Syn. sp. 6803    10.4 ± 0.24              1.8
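
The fold-increase column of Table 1 follows directly from the specific activities: each value is the sample activity divided by the control activity. A minimal sketch, using only the mean values printed in the table (the variable and function names are illustrative):

```python
# Mean specific activities from Table 1, in nmol/min/mg protein.
SPECIFIC_ACTIVITY = {
    "Control": 5.8,
    "B. subtilis": 11.5,
    "Syn. sp. 6803": 10.4,
}


def fold_increase(sample: str, control: str = "Control") -> float:
    """Ratio of a sample's specific activity to that of the control."""
    return SPECIFIC_ACTIVITY[sample] / SPECIFIC_ACTIVITY[control]


if __name__ == "__main__":
    for name in SPECIFIC_ACTIVITY:
        print(f"{name}: {fold_increase(name):.1f}-fold")
```

Rounded to one decimal place, the ratios reproduce the 2.0 and 1.8 fold increases reported in the table.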



Claims



1. A method of manipulating isoprenoid
expression in a plant or plant cell having a
mevalonate independent isopentyl diphosphate
synthesising pathway, which method comprises altering
the activity of the enzyme 1-deoxy-D-xylulose-5-
phosphate synthase (DXPS), or a functional equivalent
thereof.

2. A method according to claim 1 wherein said 
isoprenoid production is increased by enhancing the 
activity or expression of said DXPS or lowered by 
inhibiting the activity or expression of said DXPS 
enzyme.

3. A method according to claim 2 wherein said 
enhanced DXPS activity occurs by transformation of 
said plant or plant cell with a vector comprising a 
nucleic acid molecule encoding said DXPS operably 
linked to an expression control sequence and 
optionally a reporter molecule.

4. A method according to claim 3 wherein said
DXPS encoded by said nucleic acid sequence is 
endogenous to said plant or plant cell. 

5. A method according to claim 3 or 4 wherein 
said vector comprises one or more nucleic acid
sequences encoding a polypeptide(s) capable of
producing a desired isoprenoid. 

6. A method according to claim 3 or 4 wherein 
said plant or plant cell is transformed with a further 
vector comprising one or more nucleic acid sequences 
encoding a polypeptide(s) capable of producing a
desired isoprenoid.

7. A method according to any of claims 3 to 6 
wherein said vector comprising said nucleic acid 
sequence(s) encoding said DXPS and/or said
polypeptide(s) capable of producing said isoprenoid
further comprises a nucleic acid sequence of either a 
tissue specific promoter and/or encoding a plastid 
transit peptide. 

8. A method according to any of claims 5 to 7
wherein said desired isoprenoid is one conferring a 
nutritional benefit. 

9. A method according to claim 8 wherein said 
isoprenoid comprises any of the carotenoids, vitamins 
E, B1 or B6, chlorophylls, phenylquinones or
diterpenes.

10. A plant or plant cell which has a mevalonate 
independent IPP biosynthetic pathway and which is
transformed or transfected with a vector comprising a 
nucleic acid sequence encoding DXPS or a functional 
equivalent, derivative or bioprecursor thereof 
operably linked to an expression control sequence. 

11. A plant or plant cell according to claim 10 
wherein said vector further comprises a nucleic acid 
molecule encoding a reporter molecule. 

12. A plant or plant cell according to claim 10 
or 11 which further comprises a vector comprising one 
or more nucleic acid sequences encoding one or more 
polypeptides capable of producing a desired 
isoprenoid. 

13. A plant or plant cell according to claim 12
wherein said desired isoprenoid comprises any of the
carotenoids, vitamin E, B1 or B6, chlorophylls,
phenylquinones, or diterpenes. 

14. A method of manipulating isoprenoid 
expression in a cell or organism having a mevalonate 
independent isopentyl diphosphate synthesising 
pathway, which method comprises altering the activity 
of the enzyme 1-deoxy-D-xylulose-5-phosphate synthase
(DXPS) or a functional equivalent thereof by 
transforming said cell or organism with a vector 
comprising a nucleic acid optionally linked to an 
expression control sequence and operably a reporter 
molecule, and a further vector comprising one or more 
nucleic acid sequences encoding a polypeptide(s)
capable of producing a desired isoprenoid. 

15. A method according to claim 14, wherein said 
nucleic acid sequence encoding said DXPS is endogenous 
to said cell or organism. 

16. A method according to claim 14 or 15, 
wherein said cell is any of a bacterial, yeast or 
algal cell. 

17. A method according to claim 16, wherein said 
bacterial cell is E. coli.

18. A method according to claim 14 or 15 wherein 
said organism is a plant. 

19. A method according to any of claims 14 to 18 
wherein said cell is any of a bacterial, yeast or 
algal cell. 

20. A method according to any of claims 14 to 19 
wherein said bacterial cell is E. coli.

21. A method according to claim 20 wherein said 
organism is a plant. 

22. A method of identifying a compound which 
modulates isoprenoid activity or expression, said 
method comprising contacting said compound to be 
tested with a molecule which is a component of the 
mevalonate independent IPP biosynthetic pathway and 
which molecule undergoes a reaction catalysed by DXPS 
activity in the presence of an appropriate reactant, 
in the presence of DXPS or a functional equivalent 
thereof and monitoring the yield of a product of the 
reaction when compared to the same reaction performed 
in the absence of the compound to be tested. 

23. A method according to claim 22 wherein said 
molecule comprises pyruvate and said appropriate 
reactant comprises glyceraldehyde-3-phosphate or vice 
versa.

24. A transgenic cell, tissue or organism having 
a mevalonate independent IPP biosynthetic pathway and 
increased isoprenoid activity which cell, tissue or 
organism comprises at least one transgene capable of 
expressing DXPS or a functional equivalent thereof. 

25. A transgenic cell, tissue or organism 
according to claim 24, which comprises at least one 
additional copy of any of the nucleic acid sequences 
identified in Figure 3, or the complement thereof. 

26. A transgenic cell, tissue or organism 
according to claim 24 or 25, further comprising a 
transgene capable of expressing one or more 
polypeptides capable of producing a desired 
isoprenoid, or a functional equivalent.

27. A transgenic cell, tissue or organism 
according to any of claims 24 to 26, wherein said
organism is a plant. 

28. A transgenic cell, tissue or organism
according to claim 27, wherein said plant is of the
Lycopersicon spp.

29. Progeny of the organism according to any of
claims 24 to 28 having increased isoprenoid activity. 

30. A transformed plant comprising a transgene 
capable of expressing DXPS from E.coli having the 
sequence according to Figure 8 and which plant 
comprises a higher level of isoprenoid than an 
untransformed plant.

31. A transformed plant according to claim 30 
comprising any of constructs pVB6_TSEC-LML or
pVB6_35S_TSEC-LML illustrated in Figures 6 and 7.

32. A transformed plant according to claim 30 or 
31 wherein said plant is a tomato plant. 

33. A tomato fruit produced by a plant according 
to claim 32 and having a higher level of isoprenoid 
activity than a wild type fruit. 

34. A seed produced by a plant according to 
claim 32 and having a higher level of isoprenoid 
activity than a seed from a wild type plant. 

SEQUENCE LISTING

09/890229

JG17 Rec'd PCT/PTO 27 JUL 2001



<110> Bramley, Peter Michael 
Harker, Mark 

<120> Manipulating Isoprenoid Expression 

<130> B0192/7031 

<140> PCT/GB00/00263 
<141> 2000-01-28 

<150> GB 9901902.8 
<151> 1999-01-28 

<160> 12 

<170> PatentIn version 3.0

<210> 1 
<211> 640 
<212> PRT 

<213> Synechocystis sp. 
<400> 1 

Met His Ile Ser Glu Leu Thr His Pro Asn Glu Leu Lys Gly Leu Ser
1               5                   10                  15

Ile Arg Glu Leu Glu Glu Val Ser Arg Gln Ile Arg Glu Lys His Leu
            20                  25                  30

Gln Thr Val Ala Thr Ser Gly Gly His Leu Gly Pro Gly Leu Gly Val
        35                  40                  45

Val Glu Leu Thr Val Ala Leu Tyr Ser Thr Leu Asp Leu Asp Lys Asp
    50                  55                  60

Arg Val Ile Trp Asp Val Gly His Gln Ala Tyr Pro His Lys Met Leu
65                  70                  75                  80

Thr Gly Arg Tyr His Asp Phe His Thr Leu Arg Gln Lys Asp Gly Val
                85                  90                  95

Ala Gly Tyr Leu Lys Arg Ser Glu Ser Arg Phe Asp His Phe Gly Ala
            100                 105                 110

Gly His Ala Ser Thr Ser Ile Ser Ala Gly Leu Gly Met Ala Leu Ala
        115                 120                 125

Arg Asp Ala Lys Gly Glu Asp Phe Lys Val Val Ser Ile Ile Gly Asp
    130                 135                 140

Gly Ala Leu Thr Gly Gly Met Ala Leu Glu Ala Ile Asn His Ala Gly
145                 150                 155                 160

His Leu Pro His Thr Arg Leu Met Val Ile Leu Asn Asp Asn Glu Met
                165                 170                 175

Ser Ile Ser Pro Asn Val Gly Ala Ile Ser Arg Tyr Leu Asn Lys Val
            180                 185                 190

Arg Leu Ser Ser Pro Met Gln Phe Leu Thr Asp Asn Leu Glu Glu Gln
        195                 200                 205

Ile Lys His Leu Pro Phe Val Gly Asp Ser Leu Thr Pro Glu Met Glu
    210                 215                 220

Arg Val Lys Glu Gly Met Lys Arg Leu Val Val Pro Lys Val Gly Ala
225                 230                 235                 240

Val Ile Glu Glu Leu Gly Phe Lys Tyr Phe Gly Pro Ile Asp Gly His
                245                 250                 255

Ser Leu Gln Glu Leu Ile Asp Thr Phe Lys Gln Ala Glu Lys Val Pro
            260                 265                 270

Gly Pro Val Phe Val His Val Ser Thr Thr Lys Gly Lys Gly Tyr Asp
        275                 280                 285

Leu Ala Glu Lys Asp Gln Val Gly Tyr His Ala Gln Ser Pro Phe Asn
    290                 295                 300

Leu Ser Thr Gly Lys Ala Tyr Pro Ser Ser Lys Pro Lys Pro Pro Ser
305                 310                 315                 320

Tyr Ser Lys Val Phe Ala His Thr Leu Thr Thr Leu Ala Lys Glu Asn
                325                 330                 335

Pro Asn Ile Val Gly Ile Thr Ala Ala Met Ala Thr Gly Thr Gly Leu
            340                 345                 350

Asp Lys Leu Gln Ala Lys Leu Pro Lys Gln Tyr Val Asp Val Gly Ile
        355                 360                 365

Ala Glu Gln His Ala Val Thr Leu Ala Ala Gly Met Ala Cys Glu Gly
    370                 375                 380

Ile Arg Pro Val Val Ala Ile Tyr Ser Thr Phe Leu Gln Arg Gly Tyr
385                 390                 395                 400

Asp Gln Ile Ile His Asp Val Cys Ile Gln Lys Leu Pro Val Phe Phe
                405                 410                 415

Cys Leu Asp Arg Ala Gly Ile Val Gly Ala Asp Gly Pro Thr His Gln
            420                 425                 430

Gly Met Tyr Asp Ile Ala Tyr Leu Arg Cys Ile Pro Asn Leu Val Leu
        435                 440                 445

Met Ala Pro Lys Asp Glu Ala Glu Leu Gln Gln Met Leu Val Thr Gly
    450                 455                 460

Val Asn Tyr Thr Gly Gly Ala Ile Ala Met Arg Tyr Pro Arg Gly Asn
465                 470                 475                 480

Gly Ile Gly Val Pro Leu Met Glu Glu Gly Trp Glu Pro Leu Glu Ile
                485                 490                 495

Gly Lys Ala Glu Ile Leu Arg Ser Gly Asp Asp Val Leu Leu Leu Gly
            500                 505                 510

Tyr Gly Ser Met Val Tyr Pro Ala Leu Gln Thr Ala Glu Leu Leu His
        515                 520                 525

Glu His Gly Ile Glu Ala Thr Val Val Asn Ala Arg Phe Val Lys Pro
    530                 535                 540

Leu Asp Thr Glu Leu Ile Leu Pro Leu Ala Glu Arg Ile Gly Lys Val
545                 550                 555                 560

Val Thr Met Glu Glu Gly Cys Leu Met Gly Gly Phe Gly Ser Ala Val
                565                 570                 575

Ala Glu Ala Leu Met Asp Asn Asn Val Leu Val Pro Leu Lys Arg Leu
            580                 585                 590

Gly Val Pro Asp Ile Leu Val Asp His Ala Thr Pro Glu Gln Ser Thr
        595                 600                 605

Val Asp Leu Gly Leu Thr Pro Ala Gln Met Ala Gln Asn Ile Met Ala
    610                 615                 620

Ser Leu Phe Lys Thr Glu Thr Glu Ser Val Val Ala Pro Gly Val Ser
625                 630                 635                 640
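
Sequence listings of this kind give residues in three-letter code with position numbers printed beneath every fifth residue. For use with alignment or search tools it is often convenient to convert a listing to one-letter code. The following is a minimal sketch; the function name is illustrative, and it assumes the text has already been cleaned of OCR errors, since any unrecognised token raises a KeyError.

```python
# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing: str) -> str:
    """Convert three-letter residues to a one-letter string.

    Purely numeric tokens (the position markers printed under the
    residues) are skipped; any other unknown token raises KeyError.
    """
    tokens = [tok for tok in listing.split() if not tok.isdigit()]
    return "".join(THREE_TO_ONE[tok] for tok in tokens)


if __name__ == "__main__":
    first_row = """Met His Ile Ser Glu Leu Thr His Pro Asn Glu Leu Lys Gly Leu Ser
    1               5                   10                  15"""
    print(to_one_letter(first_row))
```

Failing loudly on unknown tokens is deliberate: a mis-read residue such as "lie" or "Gin" should be corrected in the listing rather than silently dropped.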



<210> 2 

<211> 633 

<212> PRT 

<213> Bacillus 

<400> 2 

Met Asp Leu Leu 
1 

He Asp Glu Leu 
20 

Thr Ser Leu Ser 
35 

Val Glu Leu Thr 
50 

Lys Phe Leu Trp 
65 

Thr Gly Arg Gly 



Cys Gly Phe Pro 
100 

Gly His Ser Ser 
115 



subtilis 



Ser He Gin Asp 
5 

Glu Lys Leu Ser 



Ala Ser Gly Gly 
40 

Val Ala Leu His 
55 

Asp Val Gly His 
70 

Lys Glu Phe Ala 
85 

Lys Arg Ser Glu 



Thr Ser Leu Ser 
120 



Pro Ser Phe Leu 
10 

Asp Glu He Arg 
25 

His He Gly Pro 



Lys Glu Phe Asn 
60 

Gin Ser Tyr Val 
75 

Thr Leu Arg Gin 
90 

Ser Glu His Asp 
105 

Gly Ala Met Gly 



Lys Asn Met Ser 
15 

Gin Phe Leu He 
30 

Asn Leu Gly Val 
45 

Ser Pro Lys Asp 



His Lys Leu Leu 
80 

Tyr Lys Gly Leu 
95 

Val Trp Glu Thr 
110 

Met Ala Ala Ala 
125 






Arg Asp Ile Lys 
130 

Gly Ala Leu Thr 
145 

Asp Glu Lys Lys 



Ile Ala Pro Asn 
180 

Thr Ala Gly Lys 
195 

Lys Lys Ile Pro 
210 

Val Lys Asp Ser 
225 

Glu Leu Gly Phe 



Glu Leu Ile Glu 
260 

Leu Leu His Val 
275 

Thr Asp Thr Ile 
290 

Thr Gly Asp Phe 
305 

Leu Val Ser Gly 



Val Ala Ile Thr 
340 

Ala Lys Glu Phe 
355 

His Ala Ala Thr 
370 

Phe Leu Ala Ile 
385 

Val His Asp Ile 



Arg Ala Gly Leu 
420 

Asp Ile Ala Phe 
435 



Gly Thr Asp Glu 
135 

Gly Gly Met Ala 
150 

Asp Met Ile Val 
165 

Val Gly Ala Ile 



Tyr Gln Trp Val 
200 

Ala Val Gly Gly 
215 

Leu Lys Tyr Met 
230 

Thr Tyr Leu Gly 
245 

Asn Leu Gln Tyr 



Ile Thr Lys Lys 
280 

Gly Thr Trp His 
295 

Val Lys Pro Lys 
310 

Thr Val Gln Arg 
325 

Pro Ala Met Pro 



Pro Asp Arg Met 
360 

Met Ala Ala Ala 
375 

Tyr Ser Thr Phe 
390 

Cys Arg Gln Asn 
405 

Val Gly Ala Asp 



Met Arg His Ile 
440 



Tyr Ile Ile Pro 
140 

Leu Glu Ala Leu 

155 

Ile Leu Asn Asp 
170 

His Ser Met Leu 
185 

Lys Asp Glu Leu 



Lys Leu Ala Ala 
220 

Leu Val Ser Gly 
235 

Pro Val Asp Gly 
250 

Ala Lys Lys Thr 
265 

Gly Lys Gly Tyr 



Gly Thr Gly Pro 
300 

Ala Ala Ala Pro 
315 

Met Ala Arg Glu 
330 

Val Gly Ser Lys 
345 

Phe Asp Val Gly 



Met Ala Met Gln 
380 

Leu Gln Arg Ala 
395 

Ala Asn Val Phe 
410 

Gly Glu Thr His 
425 

Pro Asn Met Val 



Ile Ile Gly Asp 



Asn His Ile Gly 
160 

Asn Glu Met Ser 
175 

Gly Arg Leu Arg 
190 

Glu Tyr Leu Phe 
205 

Thr Ala Glu Arg 



Met Phe Phe Glu 
240 

His Ser Tyr His 
255 

Lys Gly Pro Val 
270 

Lys Pro Ala Glu 
285 

Tyr Lys Ile Asn 



Ser Trp Ser Gly 
320 

Asp Gly Arg Ile 
335 

Leu Glu Gly Phe 
350 

Ile Ala Glu Gln 
365 

Gly Met Lys Pro 



Tyr Asp Gln Val 
400 

Ile Gly Ile Asp 
415 

Gln Gly Val Phe 
430 

Leu Met Met Pro 
445 



Lys Asp Glu Asn Glu 
450 

Asp Glu Gly Pro Ile 
465 

Val Lys Met Asp Glu 

485 

Val Leu Arg Pro Gly 
500 

Ile Glu Met Ala Ile 
515 

Ser Val Arg Val Val 
530 

Met Met Lys Ser Ile 
545 

Glu Ala Val Leu Glu 

565 

His Asp Gln Gly Glu 
580 

Asp Arg Phe Ile Glu 
595 

Gly Leu Thr Lys Gln 
610 

Pro Lys Thr His Lys 
625 



Gly Gln His Met Val His 
455 

Ala Met Arg Phe Pro Arg 
470 475 

Gln Leu Lys Thr Ile Pro 

490 

Asn Asp Ala Val Ile Leu 
505 

Glu Ala Ala Glu Glu Leu 
520 

Asn Ala Arg Phe Ile Lys 
535 

Leu Lys Glu Gly Leu Pro 
550 555 

Gly Gly Phe Gly Ser Ser 

570 

Tyr His Thr Pro Ile Asp 
585 

His Gly Ser Val Thr Ala 
600 

Gln Val Ala Asn Arg Ile 
615 

Gly Ile Gly Ser 
630 



Thr Ala Leu Ser Tyr 
460 

Gly Asn Gly Leu Gly 

480 

Ile Gly Thr Trp Glu 
495 

Thr Phe Gly Thr Thr 
510 

Gln Lys Glu Gly Leu 
525 

Pro Ile Asp Glu Lys 
540 

Ile Leu Thr Ile Glu 

560 

Ile Leu Glu Phe Ala 
575 

Arg Met Gly Ile Pro 
590 

Leu Leu Glu Glu Ile 
605 

Arg Leu Leu Met Pro 
620 



<210> 3 

<211> 620 

<212> PRT 

<213> Escherichia coli 



<400> 3 



Met Ser Phe Asp Ile Ala Lys Tyr 
1 5 

Thr Gln Glu Leu Arg Leu Leu Pro 
20 

Asp Glu Leu Arg Arg Tyr Leu Leu 
35 40 

His Phe Ala Ser Gly Leu Gly Thr 
50 55 

Tyr Val Tyr Asn Thr Pro Phe Asp 
65 70 

Gln Ala Tyr Pro His Lys Ile Leu 



Pro Thr Leu Ala Leu Val Asp Ser 
10 15 

Lys Glu Ser Leu Pro Lys Leu Cys 
25 30 

Asp Ser Val Ser Arg Ser Ser Gly 

45 

Val Glu Leu Thr Val Ala Leu His 
60 

Gln Leu Ile Trp Asp Val Gly His 
75 80 

Thr Gly Arg Arg Asp Lys Ile Gly 
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Thr lie Arg Gin 
100 

Ser Glu Tyr Asp 
115 

Ala Gly Ile Gly 
130 

Arg Thr Val Cys 
145 

Phe Glu Ala Met 



Ile Leu Asn Asp 
180 

Asn Asn His Leu 
195 

Arg Glu Gly Gly 
210 

Leu Leu Lys Arg 
225 

Thr Leu Phe Glu 



His Asp Val Leu 
260 

Lys Gly Pro Gin 
275 

Glu Pro Ala Glu 
290 

Asp Pro Ser Ser 
305 

Tyr Ser Lys Ile 



Asn Lys Leu Met 
340 

Val Glu Phe Ser 
355 

Ala Glu Gln His 
370 

Tyr Lys Pro Ile 
385 

Asp Gln Val Leu 



85 

Lys Gly Gly Leu 



Val Leu Ser Val 
120 

Ile Ala Val Ala 
135 

Val Ile Gly Asp 
150 

Asn His Ala Gly 
165 

Asn Glu Met Ser 



Ala Gln Leu Leu 
200 

Lys Lys Val Phe 
215 

Thr Glu Glu His 
230 

Glu Leu Gly Phe 
245 

Gly Leu Ile Thr 



Phe Leu His Ile 
280 

Lys Asp Pro Ile 
295 

Gly Cys Leu Pro 
310 

Phe Gly Asp Trp 
325 

Ala Ile Thr Pro 



Arg Lys Phe Pro 
360 

Ala Val Thr Phe 
375 

Val Ala Ile Tyr 
390 

His Asp Val Ala 



90 

His Pro Phe Pro 
105 

Gly His Ser Ser 



Ala Glu Lys Glu 
140 

Gly Ala Ile Thr 
155 

Asp Ile Arg Pro 
170 

Ile Ser Glu Asn 
185 

Ser Gly Lys Leu 



Ser Gly Val Pro 
220 

Ile Lys Gly Met 
235 

Asn Tyr Ile Gly 
250 

Thr Leu Lys Asn 
265 

Met Thr Lys Lys 



Thr Phe His Ala 
300 

Lys Ser Ser Gly 
315 

Leu Cys Glu Thr 
330 

Ala Met Arg Glu 
345 

Asp Arg Tyr Phe 



Ala Ala Gly Leu 
380 

Ser Thr Phe Leu 
395 

Ile Gln Lys Leu 



95 

Trp Arg Gly Glu 
110 

Thr Ser Ile Ser 
125 

Gly Lys Asn Arg 



Ala Gly Met Ala 

160 

Asp Met Leu Val 
175 

Val Gly Ala Leu 
190 

Tyr Ser Ser Leu 
205 

Pro Ile Lys Glu 



Val Val Pro Gly 
240 

Pro Val Asp Gly 
255 

Met Arg Asp Leu 
270 

Gly Arg Gly Tyr 
285 

Val Pro Lys Phe 



Gly Leu Pro Ser 
320 

Ala Ala Lys Asp 
335 

Gly Ser Gly Met 
350 

Asp Val Ala Ile 
365 

Ala Ile Gly Gly 



Gln Arg Ala Tyr 
400 

Pro Val Leu Phe 



405 



410 



415 



Ala Ile Asp Arg Ala Gly Ile 
420 

Gly Ala Phe Asp Leu Ser Tyr 
435 



Val Gly Ala Asp Gly Gln 
425 



Thr His Gln 
430 



Leu Arg Cys Ile Pro Glu Met Val Ile 
440 445 



Met Thr Pro Ser Asp Glu Asn 
450 455 



Glu Cys Arg Gln Met Leu Tyr Thr Gly 

460 



Tyr 
465 



His Tyr Asn Asp 



Gly Pro Ser Ala Val Arg Tyr Pro 
470 475 



Arg Gly Asn 
480 



Ala Val Gly Val Glu Leu Thr 

485 



Pro Leu Glu Lys Leu Pro 
490 



Ile Gly Lys 
495 



Gly Ile Val Lys Arg Arg Gly 
500 



Glu Lys Leu Ala Ile Leu 
505 



Asn Phe Gly 
510 



Thr Leu Met Pro Glu Ala Ala 
515 



Lys Val Ala Glu Ser Leu Asn Ala Thr 
520 525 



Leu Val Asp Met Arg Phe Val 
530 535 



Lys Pro Leu Asp Glu Ala Leu Ile Leu 

540 



Glu 
545 



Met Ala Ala Ser 



His Glu Ala Leu Val Thr Val Glu 
550 555 



Glu Asn Ala 
560 



Ile Met Gly Gly Ala Gly Ser 

565 



Gly Val Asn Glu Val Leu 
570 



Met Ala His 
575 



Arg Lys Pro Val Pro Val Leu 
580 



Asn Ile Gly Leu Pro Asp 
585 



Phe Phe Ile 
590 



Pro Gln Gly Thr Gln Glu Glu 
595 



Met Arg Ala Glu Leu Gly Leu Asp Ala 
600 605 



Ala Gly Met Glu Ala Lys Ile 
610 615 



Lys Ala Trp Leu Ala 

620 



<210> 4 
<211> 184 
<212> PRT 

<213> Lycopersicon esculentum 
<400> 4 

Met Ala Leu Cys Ala Tyr Ala Phe Pro Gly Ile Leu Asn Arg Thr Gly 
1 5 10 15 

Val Val Ser Asp Ser Ser Lys Ala Thr Pro Leu Phe Ser Gly Trp Ile 
20 25 30 

His Gly Thr Asp Leu Gln Phe Leu Phe Gln His Lys Leu Thr His Glu 
35 40 45 

Val Lys Lys Arg Ser Arg Val Val Gln Ala Ser Leu Ser Glu Ser Gly 
50 55 60 
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Glu Tyr Tyr Thr Gin Arg Pro Pro Thr Pro lie Leu Asp Thr Val Asn 
65 70 75 80 

Tyr Pro Ile His Met Lys Asn Leu Ser Leu Lys Glu Leu Lys Gln Leu 

85 90 95 

Ala Asp Glu Leu Arg Ser Asp Thr Ile Phe Asn Val Ser Lys Thr Gly 
100 105 110 

Gly His Leu Gly Ser Ser Leu Gly Val Val Glu Leu Thr Val Ala Leu 
115 120 125 

His Tyr Val Phe Asn Ala Pro Gln Asp Arg Ile Leu Trp Asp Val Gly 
130 135 140 

His Gln Ser Tyr Pro His Lys Ile Leu Thr Gly Arg Arg Asp Lys Met 
145 150 155 160 

Ser Thr Leu Arg Gln Thr Asp Gly Leu Ala Gly Phe Thr Lys Arg Ser 

165 170 175 

Glu Ser Glu Tyr Asp Cys Phe Gly 
180 



<210> 5 
<211> 16032 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plasmid 
p VB 6_3 5 S_T S E C__LML 

<220> 

<221> unsure 

<222> (1641)..(1641) 

<223> n = a, c, g, or t/u 

<220> 

<221> unsure 

<222> (1644)..(1644) 

<223> n = a, c, g, or t/u 

<220> 

<221> unsure 

<222> (1652)..(1652) 

<223> n = a, c, g, or t/u 

<220> 

<221> unsure 

<222> (1656)..(1656) 

<223> n = a, c, g, or t/u 

<400> 5 

atgattgaac aagatggatt gcacgcaggt tctccggccg cttgggtgga gaggctattc 60 
ggctatgact gggcacaaca gacaatcggc tgctctgatg ccgccgtgtt ccggctgtca 120 
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gcgcaggggc 


gcccggttct 


ttttgtcaag 


accgacctgt 


ccggtgccct 


gaatgaactg 


180 


cagcgcggct 


atcgtggctg 


gccacgacgg 


gcgttccttg 


cgcagctgtg 


ctcgacgttg 


240 


tcactgaagc 


gggaagggac 


tggctgctat 


tgggcgaagt 


gccggggcag 


gatctcctgt 


300 


catctcacct 


tgctcctgcc 


gagaaagtat 


ccatcatggc 


tgatgcaatg 


cggcggctgc 


360 


atacgcttga 


tccggctacc 


tgcccattcg 


accaccaagc 


gaaacatcgc 


atcgagcgag 


420 


cacgtactcg 


gatggaagcc 


ggtcttgtcg 


atcaggatga 


tctggacgaa 


gagcatcagg 


480 


ggctcgcgcc 


agccgaactg 


ttcgccaggc 


tcaaggcgcg 


catgcccgac 


ggcgaggatc 


540 


tcgtcgtgac 


ccatggcgat 


gcctgcttgc 


cgaatatcat 


ggtggaaaat 


ggccgctttt 


600 


ctggattcat 


cgactgtggc 


cggctgggtg 


tggcggaccg 


ctatcaggac 


atagcgttgg 


660 


ctacccgtga 


tattgctgaa 


gagcttggcg 


gcgaatgggc 


tgaccgcttc 


ctcgtgcttt 


720 


acggtatcgc 


cgctcccgat 


tcgcagcgca 


tcgccttcta 


tcgccttctt 


gacgagttct 


780 


tctgagaatt 


ctttagcctt 


taaattgcta 


gttttcgtta 


aatggtcata 


acttgagcta 


840 


tggacctcca 


aattaaattt 


cgggcatacg 


ctcaaatccc 


aaattacgat 


acggagctac 


900 


cggaactgtc 


aaaatactga 


tccgggtccg 


tttgctaaaa 


acgttgacca 


aagtccacta 


960 


agttgagttt 


taaaacttta 


tttcacattt 


taatccattt 


tttacatgaa 


aactttccgg 


1020 


aaaatacgga 


gtatgcacgc 


aagtcgagga 


atgataaatg 


gtacttttcg 


aagttttaga 


1080 


actcaaaatt 


acttattaaa 


tttaaagatg 


acattttggg 


tcatcacatt 


gatgaaaatt 


1140 


ttgacattaa 


tatctgagaa 


ctttctttga 


cctttttcga 


ttctaatcca 


atcaattcaa 


1200 


cagtgtaagg 


tgaagcagtc 


aatttaaagg 


aaggccttta 


aattctaaaa 


tattgtactt 


1260 


ttcctgcgct 


tctaaaagtg 


aacgacaaag 


aaaaaatagt 


tattcttgaa 


cttaatattg 


1320 


tacaatagga 


taaattttaa 


ctatctataa 


aaagagaaca 


aaaccttaat 


ctcttcaaaa 


1380 


taatattata 


agaagtaaca 


taattgtcaa 


atgaaataca 


cataagaagc 


acataaattt 


1440 


aaatgccgta 


ttaaacttac 


agtatactat 


agcggaagtt 


ggcttgataa 


aggaacgctg 


1500 


aggagagtag 


ccgatggtga 


aacactaaca 


tcaagtgcaa 


aagaaagaaa 


aactgaaaac 


1560 


agaagatgaa 


tgtttgaagt 


gggtaaaaga 


ttacttaaaa 


gataggtttg 


gttaacaaat 


1620 


gattgtgact 


gttacycrsc 


nscntmdstr 


anstsncatg 


gctttgtgtg 


cttatgcatt 


1680 


tcctgggatt 


ttgaacagga 


ctggtgtggt 


ttcagattct 


tctaaggcaa 


cccctttgtt 


1740 


ctctggatgg 


attcatggaa 


cagatctgca 


gtttttgttc 


caacacaagc 


ttactcatga 


1800 


ggtcaagaaa 


aggtcacgtg 


tggttcaggc 


ttccttatca 


gaatctggag 


aatactacac 


1860 


acagagaccg 


ccaacgccta 


ttttggacac 


tgtgaactat 


cccattcata 


tgaaaaatct 


1920 
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gtctctgaag 


gaacttaaac 


aactagcaga 


tgaactaagg 


tcagatacaa 


ttttcaatgt 


1980 


atcaaagact 


gggggtcacc 


ttggctcaag 


tcttggtgtt 


gttgagctga 


ctgttgctct 


2040 


tcattatgtc 


ttcaatgcac 


cgcaagatag 


gattctctgg 


gatgttggtc 


atcagtctta 


2100 


tcctcacaaa 


atcttgactg 


gtagaaggga 


caagatgtcg 


acattaaggc 


agacagatgg 


2160 


tcttgcagga 


tttactaagc 


gatcggagag 


tgaatatgat 


tgctttgatg 


agttttgata 


2220 


ttgccaaata 


cccgaccctg 


gcactggtcg 


actccaccca 


ggagttacga 


ctgttgccga 


2280 


aagagagttt 


accgaaactc 


tgcgacgaac 


tgcgccgcta 


tttactcgac 


agcgtgagcc 


2340 


gttccagcgg 


gcacttcgcc 


tccgggctgg 


gcacggtcga 


actgaccgtg 


gcgctgcact 


2400 


atgtctacaa 


caccccgttt 


gaccaattga 


tttgggatgt 


ggggcatcag 


gcttatccgc 


2460 


ataaaatttt 


gaccggacgc 


cgcgacaaaa 


tcggcaccat 


ccgtcagaaa 


ggcggtctgc 


2520 


acccgttccc 


gtggcgcggc 


gaaagcgaat 


atgacgtatt 


aagcgtcggg 


cattcatcaa 


2580 


cctccatcag 


tgccggaatt 


ggtattgcgg 


ttgctgccga 


aaaagaaggc 


aaaaatcgcc 


2640 


gcaccgtctg 


tgtcattggc 


gatggcgcga 


ttaccgcagg 


catggcgttt 


gaagcgatga 


2700 


atcacgcggg 


cgatatccgt 


cctgatatgc 


tggtgattct 


caacgacaat 


gaaatgtcga 


2760 


tttccgaaaa 


tgtcggcgcg 


ctcaacaacc 


atctggcaca 


gctgctttcc 


ggtaagcttt 


2820 


actcttcact 



gcgcgaaggc 


gggaaaaaag 


ttttctctgg 


cgtgccgcca 


attaaagagc 


2880 


tgctcaaacg 


caccgaagaa 


catattaaag 


gcatggtagt 


gcctggcacg 


ttgtttgaag 


2940 


agctgggctt 


taactacatc 


ggcccggtgg 


acggtcacga 


tgtgctgggg 


cttatcacca 


3000 


cgctaaagaa 


catgcgcgac 


ctgaaaggcc 


cgcagttcct 


gcatatcatg 


accaaaaaag 


3060 


gtcgtggtta 


tgaaccggca 


gaaaaagacc 


cgatcacttt 


ccacgccgtg 


cctaaatttg 


3120 


atccctccag 


cggttgtttg 


ccgaaaagta 


gcggcggttt 


gccgagctat 


tcaaaaatct 


3180 


ttggcgactg 


gttgtgcgaa 


acggcagcga 


aagacaacaa 


gctgatggcg 


attactccgg 


3240 


cgatgcgtga 


aggttccggc 


atggtcgagt 


tttcacgtaa 


attcccggat 


cgctacttcg 


3300 


acgtggcaat 


tgccgagcaa 


cacgcggtga 


cctttgctgc 


gggtctggcg 


attggtgggt 


3360 


acaaacccat 


tgtcgcgatt 


tactccactt 


tcctgcaacg 


cgcctatgat 


caggtgctgc 


3420 


atgacgtggc 


gattcaaaag 


cttccggtcc 


tgttcgccat 


cgaccgcgcg 


ggcattgttg 


3480 


gtgctgacgg 


tcaaacccat 


cagggtgctt 


ttgatctctc 


ttacctgcgc 


tgcataccgg 


3540 


aaatggtcat 


tatgaccccg 


agcgatgaaa 


acgaatgtcg 


ccagatgctc 


tataccggct 


3600 


atcactataa 


cgatggcccg 


tcagcggtgc 


gctacccgcg 


tggcaacgcg 


gtcggcgtgg 


3660 


aactgacgcc 


gctggaaaaa 


ctaccaattg 


gcaaaggcat 


tgtgaagcgt 


cgtggcgaga 


3720 
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aactggcgat ccttaacttt ggtacgctga tgccagaagc ggcgaaagtc gccgaatcgc 3780 

tgaacgccac gctggtcgat atgcgttttg tgaaaccgct tgatgaagcg ttaattctgg 3840 

aaatggccgc cagccatgaa gcgctggtca ccgtagaaga aaacgccatt atgggcggcg 3900 

caggcagcgg cgtgaacgaa gtgctgatgg cccatcgtaa accagtaccc gtgctgaaca 3960 

ttggcctgcc ggacttcttt attccgcaag gaactcagga agaaatgcgc gccgaactcg 4020 

gcctcgatgc cgctggtatg gaagccaaaa tcaaggcctg gctggcataa cgaattcccg 4080 

atctagtaac atagatgaca ccgcgcgcga taatttatcc tagtttgcgc gctatatttt 4140 

gttttctatc gcgtattaaa tgtataattg cgggactcta atcataaaaa cccatctcat 4200 

aaataacgtc atgcacctga atagatcttg gacaagcgtt aggcctatct gtgcattaca 4260 

tgttaattat tacatgctta acgtaattca acagaaatta tatgataatc atcgcaagac 4320 

cggcaacagg attcaatctt aagaaacttt attgccaaat gtttgaacga tcggggaaat 4380 

tcgagctccc gggctggttg ccctcgccgc tgggctggcg gccgtctatg gccctgcaaa 4440 

cgcgccagaa acgccgtcga agccgtgtgc gagacaccgc ggccgccggc gttgtggata 4500 

cctcgcggaa aacttggccc tcactgacag atgaggggcg gacgttgaca cttgaggggc 4560 

cgactcaccc ggcgcggcgt tgacagatga ggggcaggct cgatttcggc cggcgacgtg 4620 

gagctggcca gcctcgcaaa tcggcgaaaa cgcctgattt tacgcgagtt tcccacagat 4680 

gatgtggaca agcctgggga taagtgccct gcggtattga cacttgaggg gcgcgactac 4740 

tgacagatga ggggcgcgat ccttgacact tgaggggcag agtgctgaca gatgaggggc 4800 

gcacctattg acatttgagg ggctgtccac aggcagaaaa tccagcattt gcaagggttt 4860 

ccgcccgttt ttcggccacc gctaacctgt cttttaacct gcttttaaac caatatttat 4920 

aaaccttgtt tttaaccagg gctgcgccct gtgcgcgtga ccgcgcacgc cgaagggggg 4980 

tgccccccct tctcgaaccc tcccggcccg ctaacgcggg cctcccatcc ccccaggggc 5040 

tgcgcccctc ggccgcgaac ggcctcaccc caaaaatggc agcgctggca gtccttgcca 5100 

ttgccgggat cggggcagta acgggatggg cgatcagccc gagcgcgacg cccggaagca 5160 

ttgacgtgcc gcaggtgctg gcatcgacat tcagcgacca ggtgccgggc agtgagggcg 5220 

gcggcctggg tggcggcctg cccttcactt cggccgtcgg ggcattcacg gacttcatgg 5280 

cggggccggc aatttttacc ttgggcattc ttggcatagt ggtcgcgggt gccgtgctcg 5340 

tgttcggggg tgcgataaac ccagcgaacc atttgaggtg ataggtaaga ttataccgag 5400 

gtatgaaaac gagaattgga cctttacaga attactctat gaagcgccat atttaaaaag 5460 

ctaccaagac gaagaggatg aagaggatga ggaggcagat tgccttgaat atattgacaa 5520 
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tactgataag ataatatatc ttttatatag aagatatcgc cgtatgtaag gatttcaggg 5580 

ggcaaggcat aggcagcgcg cttatcaata tatctataga atgggcaaag cataaaaact 5640 

tgcatggact aatgcttgaa acccaggaca ataaccttat agcttgtaaa ttctatcata 5700 

attgggtaat gactccaact tattgatagt gttttatgtt cagataatgc ccgatgactt 5760 

tgtcatgcag ctccaccgat tttgagaacg acagcgactt ccgtcccagc cgtgccaggt 5820 

gctgcctcag attcaggtta tgccgctcaa ttcgctgcgt atatcgcttg ctgattacgt 5880 

gcagctttcc cttcaggcgg gattcataca gcggccagcc atccgtcatc catatcacca 5940 

cgtcaaaggg tgaagcaggc tcataagacg ccccagcgtc gccatagtgc gttcaccgaa 6000 

tacgtgcgca acaaccgtct tccggagact gtcatacgcg taaaacagcc agcgctggcg 6060 

cgatttagcc ccgacatagc cccactgttc gtccatttcc gcgcagacga tgacgtcact 6120 

gcccggctgt atgcgcgagg ttaccgactg cggcctgagt tttttaagtg acgtaaaatc 6180 

gtgttgaggc caacgcccat aatgcgggct gttgcccggc atccaacgcc attcatggcc 6240 

atatcaatga ttttctggtg cgtaccgggt tgagaagcgg tgtaagtgaa ctgcagttgc 6300 

catgttttac ggcagtgaga gcagagatag cgctgatgtc cggcggtgct tttgccgtta 6360 

cgcaccaccc cgtcagtagc tgaacaggag ggacagctga tagaaacaga agccactgga 6420 

gcacctcaaa aacaccatca tacactaaat cagtaagttg gcagcatcac ccataattgt 6480 

ggtttcaaaa tcggctccgt cgatactatg ttatacgcca actttgaaaa caactttgaa 6540 

aaagctgttt tctggtattt aaggttttag aatgcaagga acagtgaatt ggagttcgtc 6600 

ttgttataat tagcttcttg gggtatcttt aaatactgta gaaaagagga aggaaataat 6660 

aaatggctaa aatgagaata tcaccggaat tgaaaaaact gatcgaaaaa taccgctgcg 6720 

taaaagatac ggaaggaatg tctcctgcta aggtatataa gctggtggga gaaaatgaaa 6780 

acctatattt aaaaatgacg gacagccggt ataaagggac cacctatgat gtggaacggg 6840 

aaaaggacat gatgctatgg ctggaaggaa agctgcctgt tccaaaggtc ctgcactttg 6900 

aacggcatga tggctggagc aatctgctca tgagtgaggc cgatggcgtc ctttgctcgg 6960 

aagagtatga agatgaacaa agccctgaaa agattatcga gctgtatgcg gagtgcatca 7020 

ggctctttca ctccatcgac atatcggatt gtccctatac gaatagctta gacagccgct 7080 

tagccgaatt ggattactta ctgaataacg atctggccga tgtggattgc gaaaactggg 7140 

aagaagacac tccatttaaa gatccgcgcg agctgtatga ttttttaaag acggaaaagc 7200 

ccgaagagga acttgtcttt tcccacggcg acctgggaga cagcaacatc tttgtgaaag 7260 

atggcaaagt aagtggcttt attgatcttg ggagaagcgg cagggcggac aagtggtatg 7320 
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acattgcctt 


ctgcgtccgg 


tcgatcaggg 


aggatatcgg 


ggaagaacag 


tatgtcgagc 


7380 


tattttttga 


cttactgggg 


atcaagcctg 


attgggagaa 


aataaaatat 


tatattttac 


7440 


tggatgaatt 


gttttagtac 


ctagatgtgg 


cgcaacgatg 


ccggcgacaa 


gcaggagcgc 


7500 


accgacttct 


tccgcatcaa 


gtgttttggc 


tctcaggccg 


aggcccacgg 


caagtatttg 


7560 


ggcaaggggt 


cgctggtatt 


cgtgcagggc 


aagattcgga 


ataccaagta 


cgagaaggac 


7620 


ggccagacgg 


tctacgggac 


cgacttcatt 


gccgataagg 


tggattatct 


ggacaccaag 


7680 


gcaccaggcg 


ggtcaaatca 


ggaataaggg 


cacattgccc 


cggcgtgagt 


cggggcaatc 


7740 


ccgcaaggag 


ggtgaatgaa 


tcggacgttt 


gaccggaagg 


catacaggca 


agaactgatc 


7800 


gacgcggggt 


tttccgccga 


ggatgccgaa 


accatcgcaa 


gccgcaccgt 


catgcgtgcg 


7860 


ccccgcgaaa 


ccttccagtc 


cgtcggctcg 


atggtccagc 


aagctacggc 


caagatcgag 


7920 


cgcgacagcg 


tgcaactggc 


tccccctgcc 


ctgcccgcgc 


catcggccgc 


cgtggagcgt 


7980 


tcgcgtcgtc 


tcgaacagga 


ggcggcaggt 


ttggcgaagt 


cgatgaccat 


cgacacgcga 


8040 


ggaactatga 


cgaccaagaa 


gcgaaaaacc 


gccggcgagg 


acctggcaaa 


acaggtcagc 


8100 


gaggccaagc 


aggccgcgtt 


gctgaaacac 


acgaagcagc 


agatcaagga 


aatgcagctt 


8160 


tccttgttcg 


atattgcgcc 


gtggccggac 


acgatgcgag 


cgatgccaaa 


cgacacggcc 


8220 


cgctctgccc 


tgttcaccac 


gcgcaacaag 


aaaatcccgc 


gcgaggcgct 


gcaaaacaag 


8280 


gtcattttcc 


acgtcaacaa 


ggacgtgaag 


atcacctaca 


ccggcgtcga 


gctgcgggcc 


8340 


gacgatgacg 


aactggtgtg 


gcagcaggtg 


ttggagtacg 


cgaagcgcac 


ccctatcggc 


8400 


gagccgatca 


ccttcacgtt 


ctacgagctt 


tgccaggacc 


tgggctggtc 


gatcaatggc 


8460 


cggtattaca 


cgaaggccga 


ggaatgcctg 


tcgcgcctac 


aggcgacggc 


gatgggcttc 


8520 


acgtccgacc 


gcgttgggca 


cctggaatcg 


gtgtcgctgc 


tgcaccgctt 


ccgcgtcctg 


8580 


gaccgtggca 


agaaaacgtc 


ccgttgccag 


gtcctgatcg 


acgaggaaat 


cgtcgtgctg 


8640 


tttgctggcg 


accactacac 


gaaattcata 


tgggagaagt 


accgcaagct 


gtcgccgacg 


8700 


gcccgacgga 


tgttcgacta 


tttcagctcg 


caccgggagc 


cgtacccgct 


caagctggaa 


8760 


accttccgcc 


tcatgtgcgg 


atcggattcc 


acccgcgtga 


agaagtggcg 


cgagcaggtc 


8820 


ggcgaagcct 


gcgaagagtt 


gcgaggcagc 


ggcctggtgg 


aacacgcctg 


ggtcaatgat 


8880 


gacctggtgc 


attgcaaacg 


ctagggcctt 


crtcrcrcTcrt cacr 

^3 ^- s ~? :d , -^- ,u y 


t" t cccrcrct" ncr 


ncr rft~ +" p^rrr'^ 

y y u 1 — ^ o. vj i. — cL 


8940 



gccagcgctt 


tactggcatt 


tcaggaacaa 


gcgggcactg 


ctcgacgcac 


ttgcttcgct 


9000 


cagtatcgct 


cgggacgcac 


ggcgcgctct 


acgaactgcc 


gataaacaga 


ggattaaaat 


9060 


tgacaattgt 


gattaaggct 


cagattcgac 


ggcttggagc 


ggccgacgtg 


caggatttcc 


9120 
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gcgagatccg 


attgtcggcc 


ctgaagaaag 


ctccagagat 


gttcgggtcc 


gtttacgagc 


9180 


acgaggagaa 


aaagcccatg 


gaggcgttcg 


ctgaacggtt 


gcgagatgcc 


gtggcattcg 


9240 


gcgcctacat 


cgacggcgag 


atcattgggc 


tgtcggtctt 


caaacaggag 


gacggcccca 


9300 


aggacgctca 


caaggcgcat 


ctgtccggcg 


ttttcgtgga 


gcccgaacag 


cgaggccgag 


9360 


gggtcgccgg 


tatgctgctg 


cgggcgttgc 


cggcgggttt 


attgctcgtg 


atgatcgtcc 


9420 


gacagattcc 


aacgggaatc 


tggtggatgc 


gcatcttcat 


cctcggcgca 


cttaatattt 


9480 


cgctattctg 


gagcttgttg 


tttatttcgg 


tctaccgcct 


gccgggcggg 


gtcgcggcga 


9540 


cggtaggcgc 


tgtgcagccg 


ctgatggtcg 


tgttcatctc 


tgccgctctg 


ctaggtagcc 


9600 


cgatacgatt 


gatggcggtc 


ctgggggcta 


tttgcggaac 


tgcgggcgtg 


gcgctgttgg 


9660 


tgttgacacc 


aaacgcagcg 


ctagatcctg 


tcggcgtcgc 


agcgggcctg 


gcgggggcgg 


9720 


tttccatggc 


gttcggaacc 


gtgctgaccc 


gcaagtggca 


acctcccgtg 


cctctgctca 


9780 


cctttaccgc 


ctggcaactg 


gcggccggag 


gacttctgct 


cgttccagta 


gctttagtgt 


9840 


ttgatccgcc 


aatcccgatg 


cctacaggaa 


ccaatgttct 


cggcctggcg 


tggctcggcc 


9900 


tgatcggagc 


gggtttaacc 


tacttccttt 


ggttccgggg 


gatctcgcga 


ctcgaaccta 


9960 


cagttgtttc 


cttactgggc 


tttctcagcc 


ccagatctgg 


ggtcgatcag 


ccggggatgc 


10020 


atcaggccga 


cagtcggaac 


ttcgggtccc 


cgacctgtac 


cattcggtga 


gcaatggata 


10080 


ggggagttga 


tatcgtcaac 


gttcacttct 


aaagaaatag 


cgccactcag 


cttcctcagc 


10140 


ggctttatcc 


agcgatttcc 


tattatgtcg 


gcatagttct 


caagatcgac 


agcctgtcac 


10200 


ggttaagcga 


gaaatgaata 


agaaggctga 


taattcggat 


ctctgcgagg 


gagatgatat 


10260 


ttgatcacag 


gcagcaacgc 


tctgtcatcg 


ttacaatcaa 


catgctaccc 


tccgcgagat 


10320 


catccgtgtt 


tcaaacccgg 


cagcttagtt 


gccgttcttc 


cgaatagcat 


cggtaacatg 


10380 


agcaaagtct 


gccgccttac 


aacggctctc 


ccgctgacgc 


cgtcccggac 


tgatgggctg 


10440 


cctgtatcga 


gtggtgattt 


tgtgccgagc 


tgccggtcgg 


ggagctgttg 


gctggctggt 


10500 


ggcaggatat 


attgtggtgt 


aaacaaattg 


acgcttagac 


aacttaataa 


cacattgcgg 


10560 


acgtttttaa 


tgtactgggg 


tggtttttct 


tttcaccagt 


gagacgggca 


acagctgatt 


10620 


gcccttcacc 


gcctggccct 


gagagagttg 


cagcaagcgg 


tccacgctgg 


tttgccccag 


10680 


caggcgaaaa 


r cctz guxnga 


tggtggttcc 


gaaatcggca 


aaatccctta 


taaatcaaaa 


10740 


gaatagcccg 


agatagggtt 


gagtgttgtt 


ccagtttgga 


acaagagtcc 


actattaaag 


10800 


aacgtggact 


ccaacgtcaa 


agggcgaaaa 


accgtctatc 


agggcgatgg 


cccactacgt 


10860 


gaaccatcac 


ccaaatcaag 


ttttttgggg 


tcgaggtgcc 


gtaaagcact 


aaatcggaac 


10920 
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cctaaaggga 


gcccccgatt 


tagagcttga 


cggggaaagc 


cggcgaacgt 


ggcgagaaag 


10980 


gaagggaaga 


aagcgaaagg 


agcgggcgcc 


attcaggctg 


cgcaactgtt 


gggaagggcg 


11040 


atcggtgcgg 


gcctcttcgc 


tattacgcca 


gctggcgaaa 


gggggatgtg 


ctgcaaggcg 


11100 


attaagttgg 


gtaacgccag 


ggttttccca 


gtcacgacgt 


tgtaaaacga 


cggccagtga 


11160 


attgaagctt 


ggcgcgccaa 


gcttggtacc 


ctcaggacta 


gaccatcgtg 


gttaaatgat 


11220 


caagtgccta 


cttggcagaa 


tttctttcga 


gcagcctcct 


cctacaagtt 


gcatttgttg 


11280 


cgcttacgat 


aattgtcaaa 


gaagtaggta 


aaataaagac 


atgatctcta 


atattaagga 


11340 


taagattaaa 


aataagtcca 


ggattacccg 


gtcggcccat 


caattacttg 


ctgacctttg 


11400 


ttgccgtccc 


acgacttcca 


ttttctaacc 


gtccattttc 


atttgttttt 


agctatattt 


11460 


aatattaatg 


ggatataaat 


tataaacatt 


cctcctccca 


aaaaaataag 


tttaagtaat 


11520 


actgcaatag 


acagtgtttt 


aagccatgta 


attcagtaaa 


agttcttttt 


tattctgaac 


11580 


ctagccctaa 


aaaggccatg 


cgggtaatta 


gttcagtcaa 


ctgaatatac 


aacgttttga 


11640 


accaaagtta 


acatgtacag 


gccaatagaa 


gttatttgac 


cgtaagctta 


gtctctacat 


11700 


tcattcaacg 


ttcttgaatc 


aaagtgacct 


gtacaggcca 


atagaagtta 


cctgaccgta 


11760 


agcttagtct 


ctacattcat 


tcctctgaga 


cgatattcta 


gaagcctgct 


ttcaagtcta 


11820 


aaaggcacaa 


tcttttctcc 


tcaccacttg 


ttgaggtact 


tatgatttta 


aagatgaaac 


11880 


atttttttta 


cttttcccct 


ttaatttctt 


tgattttttt 


ttttcttggt 


agttggaagt 


11940 


acttttcata 


ccctagaaaa 


tccactgttg 


atctttgaaa 


tatcagcaat 


ctttgaaata 


12000 


atatcagcaa 


ccacgacacc 


taccattctc 


aaattcactc 


tataaaaggg 


taaacctttg 


12060 


cttacctcta 


tgctcactca 


caaggagaac 


aaacactcat 


cggtgctaca 


taaccgcggc 


12120 


tgcaggtcga 


cggatctgta 


cccggggatc 


cgtcgatcgt 


ttcgcatgat 


tgaacaagat 


12180 


ggattgcacg 


caggttctcc 


ggccgcttgg 


gtggagaggc 


tattcggcta 


tgactgggca 


12240 


caacagacaa 


tcggctgctc 


tgatgccgcc 


gtgttccggc 


tgtcagcgca 


ggggcgcccg 


12300 


gttctttttg 


tcaagaccga 


cctgtccggt 


gccctgaatg 


aactgcagga 


cgaggcagcg 


12360 


cggctatcgt 


ggctggccac 


gacgggcgtt 


ccttgcgcag 


ctgtgctcga 


cgttgtcact 


12420 


gaagcgggaa 


gggactggct 


gctattgggc 


gaagtgccgg 


ggcaggatct 


cctgtcatct 


12480 


caccttgctc 
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ettgatcegg 


ctacctgccc 


attcgaccac 


caagcgaaac 


atcgcatcga 


gcgagcacgt 


12600 


actcggatgg 


aagccggtct 


tgtcgatcag 


gatgatctgg 


acgaagagca 


tcaggggctc 


12660 


gcgccagccg 


aactgttcgc 


caggctcaag 


gcgcgcatgc 


ccgacggcga 


ggatctcgtc 


12720 
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gtgacccatg 


gcgatgcctg 


cttgccgaat 


atcatggtgg 


aaaatggccg 


cttttctgga 


12780 


ttcatcgact 


gtggccggct 


gggtgtggcg 


gaccgctatc 


aggacatagc 


gttggctacc 


12840 


cgtgatattg 


ctgaagagct 


tggcggcgaa 


tgggctgacc 


gcttcctcgt 


gctttacggt 


12900 


atcgccgctc 


ccgattcgca 


gcgcatcgcc 


ttctatcgcc 


ttcttgacga 


gttcttctga 


12960 


gcgggactct 


ggggttcgaa 


atgaccgacc 


aagcgacgcc 


caacctgcca 


tcacgagatt 


13020 


tcgattccac 


cgccgccttc 


tatgaaaggt 


tgggcttcgg 


aatcgttttc 


cgggacgccg 


13080 


gctggatgat 


cctccagcgc 


ggggatctca 


tgctggagtt 


cttcgcccac 


ccccggatcc 


13140 


tctagagtcg 


acctgcaggc 


gttcaaacat 


ttggcaataa 


agtttcttaa 


gattgaatcc 


13200 


tgttgccggt 


cttgcgatga 


ttatcatata 


atttctgttg 


aattacgtta 


agcatgtaat 


13260 


aattaacatg 


taatgcatga 


cgttatttat 


gagatgggtt 


tttatgatta 


gagtcccgca 


13320 


attatacatt 


taatacgcga 


tagaaaacaa 


aatatagcgc 


gcaaactagg 


ataaattatc 


13380 


gcgcgcggtg 


tcatctatgt 


tactagatcg 


ggaattgcca 


agctggcgcg 


ccctgcagac 


13440 


cggtggatcc 


ggccggccga 


tccggtcatc 


ggcgggggtc 


ataacgtgac 


tcccttaatt 


13500 


ctccgctcat 


gatcagattg 


tcgtttcccg 


ccttcagttt 


aaactatcag 


tgtttgacag 


13560 


gatatattgg 


cgggtaaacc 


taagagaaaa 


gagcgtttat 


tagaataatc 


ggatatttaa 


13620 


aagggcgtga 


aaaggtttat 


ccgttcgtcc 


atttgtatgt 


gcatgccaac 


cacagggttc 


13680 


cccagatctg 


gcgccggcca 


gcgagacgag 


caagattggc 


cgccgcccga 


aacgatccga 


13740 


cagcgcgccc 


agcacaggtg 


cgcaggcaaa 


ttgcaccaac 


gcatacagcg 


ccagcagaat 


13800 


gccatagtgg 


gcggtgacgt 


cgttcgagtg 


aaccagatcg 


cgcaggaggc 


ccggcagcac 


13860 


cggcataatc 


aggccgatgc 


cgacagcgtc 


gagcgcgaca 


gtgctcagaa 


ttacgatcag 


13920 


gggtatgttg 


ggtttcacgt 


ctggcctccg 


gaccagcctc 


cgctggtccg 


attgaacgcg 


13980 


cggattcttt 


atcactgata 


agttggtgga 


catattatgt 


ttatcagtga 


taaagtgtca 


14040 


agcatgacaa 


agttgcagcc 


gaatacagtg 


atccgtgccg 


ccctggacct 


gttgaacgag 


14100 


gtcggcgtag 


acggtctgac 


gacacgcaaa 


ctggcggaac 


ggttgggggt 


tcagcagccg 


14160 


gcgctttact 


ggcacttcag 


gaacaagcgg 


gcgctgctcg 


acgcactggc 


cgaagccatg 


14220 


ctggcggaga 


atcatacgca 


ttcggtgccg 


agagccgacg 


acgactggcg 


ctcatttctg 


14280 


at per era a at" cr 
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catgccggca 


cgcgaccggg 


cgcaccgcag 


atggaaacgg 


ccgacgcgca 


gcttcgcttc 


14400 


ctctgcgagg 


cgggtttttc 


ggccggggac 


gccgtcaatg 


cgctgatgac 


aatcagctac 


14460 


ttcactgttg 


gggccgtgct 


tgaggagcag 


gccggcgaca 


gcgatgccgg 


cgagcgcggc 


14520 






ggcaccgttg 


aacaggctcc 


gctctcgccg 


ctgttgcggg 


ccgcgataga 


cgccttcgac 


14580 


gaagccggtc 


cggacgcagc 


gttcgagcag 


ggactcgcgg 


tgattgtcga 


tggattggcg 


14640 


aaaaggaggc 


tcgttgtcag 


gaacgttgaa 


ggaccgagaa 


agggtgacga 


ttgatcagga 


14700 


ccgctgccgg 


agcgcaaccc 


actcactaca 


gcagagccat 


gtagacaaca 


tcccctcccc 


14760 


ctttccaccg 


cgtcagacgc 


ccgtagcagc 


ccgctacggg 


ctttttcatg 


ccctgcccta 


14820 


gcgtccaagc 


ctcacggccg 


cgctcggcct 


ctctggcggc 


cttctggcgc 


tcttccgctt 


14880 


cctcgctcac 


tgactcgctg 


cgctcggtcg 


ttcggctgcg 


gcgagcggta 


tcagctcact 


14940 


caaaggcggt 


aatacggtta 


tccacagaat 


caggggataa 


cgcaggaaag 


aacatgtgag 


15000 


caaaaggcca 


gcaaaaggcc 


aggaaccgta 


aaaaggccgc 


gttgctggcg 


tttttccata 


15060 


ggctccgccc 


ccctgacgag 


catcacaaaa 


atcgacgctc 


aagtcagagg 


tggcgaaacc 


15120 


cgacaggact 


ataaagatac 


caggcgtttc 


cccctggaag 


ctccctcgtg 


cgctctcctg 


15180 


ttccgaccct 


gccgcttacc 


ggatacctgt 


ccgcctttct 


cccttcggga 


agcgtggcgc 


15240 


ttttccgctg 


cataaccctg 


cttcggggtc 


attatagcga 


ttttttcggt 


atatccatcc 


15300 


tttttcgcac 


gatatacagg 


attttgccaa 


agggttcgtg 


tagactttcc 


ttggtgtatc 


15360 


caacggcgtc 


agccgggcag 


gataggtgaa 


gtaggcccac 


ccgcgagcgg 


gtgttccttc 


15420 


ttcactgtcc 


cttattcgca 


cctggcggtg 


ctcaacggga 


atcctgctct 


gcgaggctgg 


15480 


ccggctaccg 


ccggcgtaac 


agatgagggc 


aagcggatgg 


ctgatgaaac 


caagccaacc 


15540 


aggaagggca 


gcccacctat 


caaggtgtac 


tgccttccag 


acgaacgaag 


agcgattgag 


15600 


gaaaaggcgg 


cggcggccgg 


catgagcctg 


tcggcctacc 


tgctggccgt 


cggccagggc 


15660 


tacaaaatca 


cgggcgtcgt 


ggactatgag 


cacgtccgcg 


agctggcccg 


catcaatggc 


15720 


gacctgggcc 


gcctgggcgg 


cctgctgaaa 


ctctggctca 


ccgacgaccc 


gcgcacggcg 


15780 


cggttcggtg 


atgccacgat 


cctcgccctg 


ctggcgaaga 


tcgaagagaa 


gcaggacgag 


15840 
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15 900 


ctaaaacggc 


cggggggtgc 


gcgtgattgc 


caagcacgtc 


cccatgcgct 


ccatcaagaa 


15960 


gagcgacttc 


gcggagctgg 


tgaagtacat 


caccgacgag 


caaggcaaga 


ccgagcgcct 


16020 


ttccgacgct 


ca 
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<210> 6 
<211> 17460 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plasmid 
<220>

<221> unsure 

<222> (1135)..(1135)

<223> n = a, c, g, or t/u 

<220> 

<221> unsure 

<222> (1145)..(1145)

<223> n = a, c, g, or t/u 

<400> 6 





atgattgaac 


aagatggatt 


gcacgcaggt 


tctccggccg 


cttgggtgga 


gaggctattc 


60 




ggctatgact 


gggcacaaca 


gacaatcggc 


tgctctgatg 


ccgccgtgtt 


ccggctgtca 


120 




gcgcaggggc 


gcccggttct 


ttttgtcaag 


accgacctgt 


ccggtgccct 


gaatgaactg 


180 




cagcgcggct 


atcgtggctg 


gccacgacgg 


gcgttccttg 


cgcagctgtg 


ctcgacgttg 


240 




tcactgaagc 


gggaagggac 


tggctgctat 


tgggcgaagt 


gccggggcag 


gatctcctgt 


300 




catctcacct 


tgctcctgcc 


gagaaagtat 


ccatcatggc 


tgatgcaatg 


cggcggctgc 


360 




atacgcttga 


tccggctacc 


tgcccattcg 


accaccaagc 


gaaacatcgc 


atcgagcgag 


420 




cacgtactcg 


gatggaagcc 


ggtcttgtcg 


atcaggatga 


tctggacgaa 


gagcatcagg 


480 




ggctcgcgcc 


agccgaactg 


ttcgccaggc 


tcaaggcgcg 


catgcccgac 


ggcgaggatc 


540 




tcgtcgtgac 


ccatggcgat 


gcctgcttgc 


cgaatatcat 


ggtggaaaat 


ggccgctttt 


600 




ctggattcat 


cgactgtggc 


cggctgggtg 


tggcggaccg 


ctatcaggac 


atagcgttgg 


660 




ctacccgtga 


tattgctgaa 


gagcttggcg 


gcgaatgggc 


tgaccgcttc 


ctcgtgcttt 


720 




acggtatcgc 


cgctcccgat 


tcgcagcgca 


tcgccttcta 


tcgccttctt 


gacgagttct 


780 




tctgaaattc 


atgtaaagca 


aaaattaaaa 


tatgatcatc 


caaattattt 


ttatagccaa 


840 




tttaggtaat 


aatcattttt 


attaccactc 


aaattaactt 


cttgttaaat 


gaacatacat 


900 




atgaaaaatc 


tagctcgaaa 


tgaaacccca 


agaaaatctg 


tagtcgctat 


caatggtgcg 


960 




cactgaataa 


accccttact 


gtgcagtagc 


ctacaaccca 


catattgtga 


ttattgatac 


1020 




tacataaaat 


actcttaatt 


gaagtagtac 


gccatgcaat 


catattctca 


ttttggagga 


1080 




atccaatcct 


catcaaagta 


cccaactagc 


ctttctccac 


aaggctttga 


acatnattcc 


1140 




aatanggaat 


acaatcatat 


tctcattttg 


gaggagcgat 


tttggaaatt 


aatatcacat 


1200 




gtttaatggc 


ctaaatgatg 


gcaatacgaa 


cactaaatcc 


tttcatatat 


tcacttaaaa 


1260 




aaaagtcata 


taattgcttt 


taggagagag 


ataaatgatc 


gtattgactt 


agagtatact 


1320 




ggttaaaatt 


attatatatc 


taataattaa 


tttgatgttt 


tagtaataaa 


ctaccatact 


1380 




tgttattttt 


cattttaaac 


tttactgtga 


aagacagaat 


gcacttcaga 


aaactaaata 


1440 
tctattacac atcatctgaa atggtcgtga gtgaaccttt ttgaatatat aatttaagtt 1500

gcaaatagag ttatttgaag aaaaaaaaaa aggaaaaaga aacagtataa tggctccttg 1560 

tcctcttagc cctccaaaaa aggaacaaat attaatttaa ttagaaaaaa acaaccttga 1620 

gcgggtcgaa ttagtaggtc tttccacatt caaattacta attttaaaaa ttattgacta 1680 

cttcaagatt ctttaatctt ttttcctttg tattgagaga gtcatcaaaa agctcactcg 1740 

ggacgtgaaa aaaaatttga cttgctaatt atattctata tttactaggt aaaatcctaa 1800 

tcttattatc acattatcac tgggattttt cggcattatt ctagctaaat atcttgtgca 1860 

attcatgtcc tccaccacaa aaaaaatgcc taattttaca acttttttgg tatgaaaaga 1920 

gagtaagaaa aaggaataag agaagtctga ttgaaagaaa taaggaagga tgaataaaaa 1980 

acaaagaaaa ttctactaga atttgtatgc cgttggatgt gaatggtttc caattttttt 2040 

gtctctcttg tttgttcaat tttaaatttc cgccaaacag acacaaaatg atccttaact 2100 

ccgctttaca agcggatagt tacgtgtctt cctctctctc ccttgttgac gtatcttaaa 2160 

aactccaaac tacccctgga tttttctaat ctttaattaa ggattatata tatatatata 2220 

tatatatata tatatatata tatatatata tatattatga taattaataa ttaaattatt 2280 

tgcacattta aaagtctatt tggattgact tattttagat gttttaaagt taaaataact 2340 

tttaaataat tttagtattc gaataaacta agaaaggtgc ttataagcac tttatgcctt 2400 

tacactacaa tgtaaaaatt aagtcaaaag ttattaaacg aacttattag tcaaaagtta 2460

aagcgattcc acacaagcgc tacgcttaat atttttttag ccaaagcgaa tccaaacaag 2520 

cgctacgtca atattctttt ccttttcttt aattgaacca tcaatcctag atctcacttt 2580 

ctctgacatg ggacctaaca ttaacaatat gagttgtggt gtgataattc gagattcctt 2640

caaccctaat aggaagtgct atagtgaaaa atgagccctc caatatccgc attcaaattt 2700 

aatccgagtt taatgtagat acaatgcata atgtgggaac cgatgggaaa gaacaaaagt 2760 

aataagatat acaatatgtt ttcacacata gtgagatcta gaaagggtag atagtatgcc 2820 

gttggtatct ctatcttaga ggtaaagtag aaaagttgtt tccaatcgat ccgaactcaa 2880 

gagaaaaaca gtttagccga tgggaaataa agaaaagaaa aactaattta ggagggtata 2940 

tatgtctttg cacaaaggct aacctaaagc cctggctcgt ataaaacgcg atcattacag 3000 

gattgcaaca taaacactca ttcaatcgaa gcttccctgt taatattatc attttgttcc 3060 

ttctttattc actcttaaac catggatggc tttgtgtgct tatgcatttc ctgggatttt 3120 

gaacaggact ggtgtggttt cagattcttc taaggcaacc cctttgttct ctggatggat 3180 

tcatggaaca gatctgcagt ttttgttcca acacaagctt actcatgagg tcaagaaaag 3240
gtcacgtgtg


gttcaggctt 


ccttatcaga 


atctggagaa 


tactacacac 


agagaccgcc 


3300 


aacgcctatt 


ttggacactg 


tgaactatcc 


cattcatatg 


aaaaatctgt 


ctctgaagga 


3360 


acttaaacaa 


ctagcagatg 


aactaaggtc 


agatacaatt 


ttcaatgtat 


caaagactgg 


3420 


gggtcacctt 


ggctcaagtc 


ttggtgttgt 


tgagctgact 


gttgctcttc 


attatgtctt 


3480 


caatgcaccg 


caagatagga 


ttctctggga 


tgttggtcat 


cagtcttatc 


ctcacaaaat 


3540 


cttgactggt 


agaagggaca 


agatgtcgac 


attaaggcag 


acagatggtc 


ttgcaggatt 


3600 


tactaagcga 


tcggagagtg 


aatatgattg 


ctttgatgag 


ttttgatatt 


gccaaatacc 


3660 


cgaccctggc 


actggtcgac 


tccacccagg 


agttacgact 


gttgccgaaa 


gagagtttac 


3720 


cgaaactctg 


cgacgaactg 


cgccgctatt 


tactcgacag 


cgtgagccgt 


tccagcgggc 


3780 


acttcgcctc 


cgggctgggc 


acggtcgaac 


tgaccgtggc 


gctgcactat 


gtctacaaca 


3840 


ccccgtttga 


ccaattgatt 


tgggatgtgg 


ggcatcaggc 


ttatccgcat 


aaaattttga 


3900 


ccggacgccg


cgacaaaatc 


ggcaccatcc 


gtcagaaagg 


cggtctgcac 


ccgttcccgt 


3960 


ggcgcggcga


aagcgaatat 


gacgtattaa 


gcgtcgggca 


ttcatcaacc 


tccatcagtg 


4020 


ccggaattgg


tattgcggtt 


gctgccgaaa 


aagaaggcaa 


aaatcgccgc 


accgtctgtg 


4080 


tcattggcga 


tggcgcgatt 


accgcaggca 


tggcgtttga 


agcgatgaat 


cacgcgggcg 


4140 


atatccgtcc


tgatatgctg 


gtgattctca 


acgacaatga 


aatgtcgatt 


tccgaaaatg 


4200 


tcggcgcgct


caacaaccat 


ctggcacagc 


tgctttccgg 


taagctttac 


tcttcactgc 


4260 


gcgaaggcgg


gaaaaaagtt 


ttctctggcg 


tgccgccaat 


taaagagctg 


ctcaaacgca 


4320 


ccgaagaaca


tattaaaggc 


atggtagtgc 


ctggcacgtt 


gtttgaagag 


ctgggcttta 


4380 


actacatcgg 


cccggtggac 


ggtcacgatg 


tgctggggct 


tatcaccacg 


ctaaagaaca 


4440 


tgcgcgacct 


gaaaggcccg 


cagttcctgc 


atatcatgac 


caaaaaaggt 


cgtggttatg 


4500 


aaccggcaga 


aaaagacccg 


atcactttcc 


acgccgtgcc 


taaatttgat 


ccctccagcg 


4560 


gttgtttgcc 


gaaaagtagc 


ggcggtttgc 


cgagctattc 


aaaaatcttt 


ggcgactggt 


4620 


tgtgcgaaac 


ggcagcgaaa 


gacaacaagc 


tgatggcgat 


tactccggcg 


atgcgtgaag 


4680 


gttccggcat 


ggtcgagttt 


tcacgtaaat 


tcccggatcg 


ctacttcgac 


gtggcaattg 


4740 


ccgagcaaca 


cgcggtgacc 


tttgctgcgg 


gtctggcgat 


tggtgggtac 


aaacccattg 


4800 


tcgcgattta 


ctccactttc 


ctgcaacgcg 


cctatgatca 


ggtgctgcat 


gacgtggcga 


4860 


ttcaaaagct 


tccggtcctg 


ttcgccatcg 


accgcgcggg 


cattgttggt 


gctgacggtc 


4920 


aaacccatca 


gggtgctttt 


gatctctctt 


acctgcgctg 


cataccggaa 


atggtcatta 


4980 


tgaccccgag 


cgatgaaaac 


gaatgtcgcc 


agatgctcta 


taccggctat 


cactataacg 


5040 
atggcccgtc


agcggtgcgc 


tacccgcgtg 


gcaacgcggt 


cggcgtggaa 


ctgacgccgc 


5100 


tggaaaaact 


accaattggc 


aaaggcattg 


tgaagcgtcg 


tggcgagaaa 


ctggcgatcc 


5160 


ttaactttgg 


tacgctgatg 


ccagaagcgg 


cgaaagtcgc 


cgaatcgctg 


aacgccacgc 


5220 


tggtcgatat 


gcgttttgtg 


aaaccgcttg 


atgaagcgtt 


aattctggaa 


atggccgcca 


5280 


gccatgaagc 


gctggtcacc 


gtagaagaaa 


acgccattat 


gggcggcgca 


ggcagcggcg 


5340 


tgaacgaagt 


gctgatggcc 


catcgtaaac 


cagtacccgt 


gctgaacatt 


ggcctgccgg 


5400 


acttctttat 


tccgcaagga 


actcaggaag 


aaatgcgcgc 


cgaactcggc 


ctcgatgccg 


5460 


ctggtatgga 


agccaaaatc 


aaggcctggc 


tggcataacg 


aattcccgat 


ctagtaacat 


5520 


agatgacacc 


gcgcgcgata 


atttatccta 


gtttgcgcgc 


tatattttgt 


tttctatcgc 


5580 


gtattaaatg 


tataattgcg 


ggactctaat 


cataaaaacc 


catctcataa 


ataacgtcat 


5640 


gcacctgaat 


agatcttgga 


caagcgttag 


gcctatctgt 


gcattacatg 


ttaattatta 


5700 


catgcttaac 


gtaattcaac 


agaaattata 


tgataatcat 


cgcaagaccg 


gcaacaggat 


5760 


tcaatcttaa 


gaaactttat 


tgccaaatgt 


ttgaacgatc 


ggggaaattc 


gagctcccgg 


5820 


gctggttgcc 


ctcgccgctg 


ggctggcggc 


cgtctatggc 


cctgcaaacg 


cgccagaaac 


5880 


gccgtcgaag 


ccgtgtgcga 


gacaccgcgg 


ccgccggcgt 


tgtggatacc 


tcgcggaaaa 


5940 


cttggccctc 


actgacagat 


gaggggcgga 


cgttgacact 


tgaggggccg 


actcacccgg 


6000 


cgcggcgttg 


acagatgagg 


ggcaggctcg 


atttcggccg 


gcgacgtgga 


gctggccagc 


6060 


ctcgcaaatc 


ggcgaaaacg 


cctgatttta 


cgcgagtttc 


ccacagatga 


tgtggacaag 


6120 


cctggggata 


agtgccctgc 


ggtattgaca 


cttgaggggc 


gcgactactg 


acagatgagg 


6180 


ggcgcgatcc 


ttgacacttg 


aggggcagag 


tgctgacaga 


tgaggggcgc 


acctattgac 


6240 


atttgagggg 


ctgtccacag 


gcagaaaatc 


cagcatttgc 


aagggtttcc 


gcccgttttt 


6300 


cggccaccgc 


taacctgtct 


tttaacctgc 


ttttaaacca 


atatttataa 


accttgtttt 


6360 


taaccagggc 


tgcgccctgt 


gcgcgtgacc 


gcgcacgccg 


aaggggggtg 


cccccccttc 


6420 


tcgaaccctc 


ccggcccgct 


aacgcgggcc 


tcccatcccc 


ccaggggctg 


cgcccctcgg 


6480 


ccgcgaacgg 


cctcacccca 


aaaatggcag 


cgctggcagt 


ccttgccatt 


gccgggatcg 


6540 


gggcagtaac 


gggatgggcg 


atcagcccga 


gcgcgacgcc 


cggaagcatt 


gacgtgccgc 


6600 


aggtgctggc 


atcgacattc 


agcgaccagg 


tgccgggcag 


tgagggcggc 


ggcctgggtg 


6660 


gcggcctgcc 


cttcacttcg 


gccgtcgggg 


cattcacgga 


cttcatggcg 


gggccggcaa 


6720 


tttttacctt 


gggcattctt 


ggcatagtgg 


tcgcgggtgc 


cgtgctcgtg 


ttcgggggtg 


6780 


cgataaaccc 


agcgaaccat 


ttgaggtgat 


aggtaagatt 


ataccgaggt 


atgaaaacga 


6840 
gaattggacc


tttacagaat 


tactctatga 


agcgccatat


1 1 aaaaagct 
agaggatgaa


gaggatgagg 


aggcagattg 


ccttgaatat 


attcracaat a 


n,yaL.clayciL 


D-JOU 


aatatatctt 


ttatatagaa 


gatatcgccg 


tatgtaagga


tttcacraaaa 

u,ju yyyyy 


caacrcrcat"an 


7 09 0 


gcagcgcgct 


tatcaatata 


tctatagaat 


gggcaaagca 


taaaaactter 


<^cl Ly y au Lad 


/ u 0 u 


tgcttgaaac 


ccaggacaat 


aaccttatag 


cttgtaaatt


ct at cat ^ a t" 


l y y y Ldd Ly a 




ctccaactta 


ttgatagtgt


tttatgttca 


crat aataccc 
y a av — l. i_ L. y
agcgacttcc
tcaggttatg


ccgctcaatt 


cgctgcgtat 


atcgcttgct


crat t a cert ere 


Ciy L L. L LL/UvL 


7 ? 9 0 


tcaggcggga 


ttcatacagc 


ggccagccat 


ccgtcatcca


tatcaccaccr 


LVjaaay y y 


/ JOU 


aagcaggctc 


ataagacgcc 


ccagcgtcgc 


catagtgcgt


tcarraaat a 


y **y Ly LaciL 


7 A A. Pi 
/ 4 4 U 


aaccgtcttc 


cggagactgt 


catacgcgta 


aaacagccag 


cactaaccrccf 


at ftsrrpppp 

ci u L Lay UL> ' 


7 ^on 

/ J L* U 


gacatagccc 


cactgttcgt 


ccatttccgc 


gcagacgatg 




uuy y L- Ly lcil 


/ jou 


gcgcgaggtt 


accgactgcg 


gcctgagttt


tttaagtgac 


otaaaat cert 


y u uy ay y LL-a 




acgcccataa 


tgcgggctgt


tgcccggcat 


ccaacaccat


- o. L.y y i_ 


a LLad Ly a L L 


icon 

/ oou 


ttctggtgcg


taccgggttg


a era acre erect cr 


a a. y u vj a d o l_ 


y L>ay l_ LLJ L-L-d 


Lg ltz l La egg 


I 1 4 U 


cagtgagagc 


agagatagcg 


ctgatgtccg 
CdLLaCCCCy
tcagtagctg


aacaggaggg 


acacrct Qr3 1 a 


y d d a. l_- ci y day 


uuauuyy dye 


dccucaaaaa 


Torn 


caccatcata 


cactaaatca


ataaattaac 


aacatrarrp 
a tela l uy uyy


L L LCaaaaLC 


/ yzu 


ggctccgtcg 


atactatgtt 


atacgccaac


1 1 1 craaaa fa 


civL L. Ly aaaa 


ayLLy L LL LU 


/ y 0 U 


tggtatttaa 


ggttttagaa 


tcrcaaacraac 


aertcra; = it"i~crrf 


d.y ULuy lull. 


gLLaLaaLia 


0 U 4 U 


gcttcttggg 


gtatctttaa 


atactgtaga


aaacracrcraacr 

t-A V*-L LL C-A. <LX d ^ 


y aaa Lact Lda 


a. Lyy LLadda 


C31UU 


tgagaatatc 


accggaattg 


aaaaaactga 


tccraaaaata 


c cn c*i~ n c rrl - ^ 
> — • ^ — y^—" *— y ^— ■ y d 


ddctycl LdLyy 


OlOU 


aaggaatgtc 


tcctgctaag


gtatataagc


t cfcrt aaaaaa 


CJ. C^L ^ U CI CA C3. CI V 


P t" 3 1" ja "H" i" a a 
LLa Ld L L Lda 




aaatgacgga 


cagccggtat 


aaagggacca 


cctatgatgt


crcraa cerncra a 


aayyaLaLyd 


oopn 
0 z 0 u 


tgctatggct 


ggaaggaaag 


ctgcctgttc 


caaaggtcct


crcactttnaa 


LyyLa Ly a Ly 


O O *i U 


gctggagcaa 


tctgctcatg


agtgaggccg 


at crcrccrt cct 


i~ trrrt' (^rrrr^i 
* — 1 — v—j v — t_ v-> y y cici 


y dy Ld l y d a y 


0 /I n Pi 
O 4 U U 


atgaacaaag


ccctgaaaag 


attatcgagc 


tgtatgcgga 


gtgcatcagg


ctctttcact 


8460 


ccatcgacat 


atcggattgt 


ccctatacga 


atagcttaga 


cagccgctta 


gccgaattgg


8520 


attacttact 


gaataacgat 


ctggccgatg 


tggattgcga


aaactgggaa 


gaagacactc 


8580 


catttaaaga 


tccgcgcgag 


ctgtatgatt 


ttttaaagac 


ggaaaagccc


gaagaggaac 


8640 
ttgtcttttc ccacggcgac ctgggagaca gcaacatctt tgtgaaagat ggcaaagtaa 8700
gtggctttat tgatcttggg agaagcggca gggcggacaa gtggtatgac attgccttct 8760
gcgtccggtc gatcagggag gatatcgggg aagaacagta tgtcgagcta ttttttgact 8820
tactggggat caagcctgat tgggagaaaa taaaatatta tattttactg gatgaattgt 8880
tttagtacct agatgtggcg caacgatgcc ggcgacaagc aggagcgcac cgacttcttc 8940
cgcatcaagt gttttggctc tcaggccgag gcccacggca agtatttggg caaggggtcg 9000
ctggtattcg tgcagggcaa gattcggaat accaagtacg agaaggacgg ccagacggtc 9060
tacgggaccg acttcattgc cgataaggtg gattatctgg acaccaaggc accaggcggg 9120
tcaaatcagg aataagggca cattgccccg gcgtgagtcg gggcaatccc gcaaggaggg 9180
tgaatgaatc ggacgtttga ccggaaggca tacaggcaag aactgatcga cgcggggttt 9240
tccgccgagg atgccgaaac catcgcaagc cgcaccgtca tgcgtgcgcc ccgcgaaacc 9300
ttccagtccg tcggctcgat ggtccagcaa gctacggcca agatcgagcg cgacagcgtg 9360
caactggctc cccctgccct gcccgcgcca tcggccgccg tggagcgttc gcgtcgtctc 9420
gaacaggagg cggcaggttt ggcgaagtcg atgaccatcg acacgcgagg aactatgacg 9480
accaagaagc gaaaaaccgc cggcgaggac ctggcaaaac aggtcagcga ggccaagcag 9540
gccgcgttgc tgaaacacac gaagcagcag atcaaggaaa tgcagctttc cttgttcgat 9600
attgcgccgt ggccggacac gatgcgagcg atgccaaacg acacggcccg ctctgccctg 9660
ttcaccacgc gcaacaagaa aatcccgcgc gaggcgctgc aaaacaaggt cattttccac 9720
gtcaacaagg acgtgaagat cacctacacc ggcgtcgagc tgcgggccga cgatgacgaa 9780
ctggtgtggc agcaggtgtt ggagtacgcg aagcgcaccc ctatcggcga gccgatcacc 9840
ttcacgttct acgagctttg ccaggacctg ggctggtcga tcaatggccg gtattacacg 9900
aaggccgagg aatgcctgtc gcgcctacag gcgacggcga tgggcttcac gtccgaccgc 9960
gttgggcacc tggaatcggt gtcgctgctg caccgcttcc gcgtcctgga ccgtggcaag 10020
aaaacgtccc gttgccaggt cctgatcgac gaggaaatcg tcgtgctgtt tgctggcgac 10080
cactacacga aattcatatg ggagaagtac cgcaagctgt cgccgacggc ccgacggatg 10140
ttcgactatt tcagctcgca ccgggagccg tacccgctca agctggaaac cttccgcctc 10200
atgtgcggat cggattccac ccgcgtgaag aagtggcgcg agcaggtcgg cgaagcctgc 10260
gaagagttgc gaggcagcgg cctggtggaa cacgcctggg tcaatgatga cctggtgcat 10320
tgcaaacgct agggccttgt ggggtcagtt ccggctgggg gttcagcagc cagcgcttta 10380
ctggcatttc aggaacaagc gggcactgct cgacgcactt gcttcgctca gtatcgctcg 10440
ggacgcacgg cgcgctctac gaactgccga taaacagagg attaaaattg acaattgtga 10500
ttaaggctca gattcgacgg cttggagcgg ccgacgtgca ggatttccgc gagatccgat 10560
tgtcggccct gaagaaagct ccagagatgt tcgggtccgt ttacgagcac gaggagaaaa 10620
agcccatgga ggcgttcgct gaacggttgc gagatgccgt ggcattcggc gcctacatcg 10680
acggcgagat cattgggctg tcggtcttca aacaggagga cggccccaag gacgctcaca 10740
aggcgcatct gtccggcgtt ttcgtggagc ccgaacagcg aggccgaggg gtcgccggta 10800
tgctgctgcg ggcgttgccg gcgggtttat tgctcgtgat gatcgtccga cagattccaa 10860
cgggaatctg gtggatgcgc atcttcatcc tcggcgcact taatatttcg ctattctgga 10920
gcttgttgtt tatttcggtc taccgcctgc cgggcggggt cgcggcgacg gtaggcgctg 10980
tgcagccgct gatggtcgtg ttcatctctg ccgctctgct aggtagcccg atacgattga 11040
tggcggtcct gggggctatt tgcggaactg cgggcgtggc gctgttggtg ttgacaccaa 11100
acgcagcgct agatcctgtc ggcgtcgcag cgggcctggc gggggcggtt tccatggcgt 11160
tcggaaccgt gctgacccgc aagtggcaac ctcccgtgcc tctgctcacc tttaccgcct 11220
ggcaactggc ggccggagga cttctgctcg ttccagtagc tttagtgttt gatccgccaa 11280
tcccgatgcc tacaggaacc aatgttctcg gcctggcgtg gctcggcctg atcggagcgg 11340
gtttaaccta cttcctttgg ttccggggga tctcgcgact cgaacctaca gttgtttcct 11400
tactgggctt tctcagcccc agatctgggg tcgatcagcc ggggatgcat caggccgaca 11460
gtcggaactt cgggtccccg acctgtacca ttcggtgagc aatggatagg ggagttgata 11520
tcgtcaacgt tcacttctaa agaaatagcg ccactcagct tcctcagcgg ctttatccag 11580
cgatttccta ttatgtcggc atagttctca agatcgacag cctgtcacgg ttaagcgaga 11640
aatgaataag aaggctgata attcggatct ctgcgaggga gatgatattt gatcacaggc 11700
agcaacgctc tgtcatcgtt acaatcaaca tgctaccctc cgcgagatca tccgtgtttc 11760
aaacccggca gcttagttgc cgttcttccg aatagcatcg gtaacatgag caaagtctgc 11820
cgccttacaa cggctctccc gctgacgccg tcccggactg atgggctgcc tgtatcgagt 11880
ggtgattttg tgccgagctg ccggtcgggg agctgttggc tggctggtgg caggatatat 11940
tgtggtgtaa acaaattgac gcttagacaa cttaataaca cattgcggac gtttttaatg 12000
tactggggtg gtttttcttt tcaccagtga gacgggcaac agctgattgc ccttcaccgc 12060
ctggccctga gagagttgca gcaagcggtc cacgctggtt tgccccagca ggcgaaaatc 12120
ctgtttgatg gtggttccga aatcggcaaa atcccttata aatcaaaaga atagcccgag 12180
atagggttga gtgttgttcc agtttggaac aagagtccac tattaaagaa cgtggactcc 12240
aacgtcaaag ggcgaaaaac cgtctatcag ggcgatggcc cactacgtga accatcaccc 12300

aaatcaagtt ttttggggtc gaggtgccgt aaagcactaa atcggaaccc taaagggagc 12360 

ccccgattta gagcttgacg gggaaagccg gcgaacgtgg cgagaaagga agggaagaaa 12420 

gcgaaaggag cgggcgccat tcaggctgcg caactgttgg gaagggcgat cggtgcgggc 12480

ctcttcgcta ttacgccagc tggcgaaagg gggatgtgct gcaaggcgat taagttgggt 12540 

aacgccaggg ttttcccagt cacgacgttg taaaacgacg gccagtgaat tgaagcttgg 12600 

cgcgccaagc ttggtaccct caggactaga ccatcgtggt taaatgatca agtgcctact 12660 

tggcagaatt tctttcgagc agcctcctcc tacaagttgc atttgttgcg cttacgataa 12720 

ttgtcaaaga agtaggtaaa ataaagacat gatctctaat attaaggata agattaaaaa 12780 

taagtccagg attacccggt cggcccatca attacttgct gacctttgtt gccgtcccac 12840 

gacttccatt ttctaaccgt ccattttcat ttgtttttag ctatatttaa tattaatggg 12900 

atataaatta taaacattcc tcctcccaaa aaaataagtt taagtaatac tgcaatagac 12960 

agtgttttaa gccatgtaat tcagtaaaag ttctttttta ttctgaacct agccctaaaa 13020 

aggccatgcg ggtaattagt tcagtcaact gaatatacaa cgttttgaac caaagttaac 13080 

atgtacaggc caatagaagt tatttgaccg taagcttagt ctctacattc attcaacgtt 13140 

cttgaatcaa agtgacctgt acaggccaat agaagttacc tgaccgtaag cttagtctct 13200 

acattcattc ctctgagacg atattctaga agcctgcttt caagtctaaa aggcacaatc 13260 

ttttctcctc accacttgtt gaggtactta tgattttaaa gatgaaacat tttttttact 13320 

tttccccttt aatttctttg attttttttt ttcttggtag ttggaagtac ttttcatacc 13380 

ctagaaaatc cactgttgat ctttgaaata tcagcaatct ttgaaataat atcagcaacc 13440 

acgacaccta ccattctcaa attcactcta taaaagggta aacctttgct tacctctatg 13500 

ctcactcaca aggagaacaa acactcatcg gtgctacata accgcggctg caggtcgacg 13560 

gatctgtacc cggggatccg tcgatcgttt cgcatgattg aacaagatgg attgcacgca 13620 

ggttctccgg ccgcttgggt ggagaggcta ttcggctatg actgggcaca acagacaatc 13680 

ggctgctctg atgccgccgt gttccggctg tcagcgcagg ggcgcccggt tctttttgtc 13740 

aagaccgacc tgtccggtgc cctgaatgaa ctgcaggacg aggcagcgcg gctatcgtgg 13800 

ctggccacga cgggcgttcc ttgcgcagct gtgctcgacg ttgtcactga agcgggaagg 13860

gactggctgc tattgggcga agtgccgggg caggatctcc tgtcatctca ccttgctcct 13920 

gccgagaaag tatccatcat ggctgatgca atgcggcggc tgcatacgct tgatccggct 13980 

acctgcccat tcgaccacca agcgaaacat cgcatcgagc gagcacgtac tcggatggaa 14040 
gccggtcttg tcgatcagga tgatctggac gaagagcatc aggggctcgc gccagccgaa 14100
ctgttcgcca ggctcaaggc gcgcatgccc gacggcgagg atctcgtcgt gacccatggc 14160 
gatgcctgct tgccgaatat catggtggaa aatggccgct tttctggatt catcgactgt 14220 
ggccggctgg gtgtggcgga ccgctatcag gacatagcgt tggctacccg tgatattgct 14280 
gaagagcttg gcggcgaatg ggctgaccgc ttcctcgtgc tttacggtat cgccgctccc 14340 
gattcgcagc gcatcgcctt ctatcgcctt cttgacgagt tcttctgagc gggactctgg 14400 
ggttcgaaat gaccgaccaa gcgacgccca acctgccatc acgagatttc gattccaccg 14460
ccgccttcta tgaaaggttg ggcttcggaa tcgttttccg ggacgccggc tggatgatcc 14520 
tccagcgcgg ggatctcatg ctggagttct tcgcccaccc ccggatcctc tagagtcgac 14580 
ctgcaggcgt tcaaacattt ggcaataaag tttcttaaga ttgaatcctg ttgccggtct 14640 
tgcgatgatt atcatataat ttctgttgaa ttacgttaag catgtaataa ttaacatgta 14700 
atgcatgacg ttatttatga gatgggtttt tatgattaga gtcccgcaat tatacattta 14760
atacgcgata gaaaacaaaa tatagcgcgc aaactaggat aaattatcgc gcgcggtgtc 14820 
atctatgtta ctagatcggg aattgccaag ctggcgcgcc ctgcagaccg gtggatccgg 14880 
ccggccgatc cggtcatcgg cgggggtcat aacgtgactc ccttaattct ccgctcatga 14940 
tcagattgtc gtttcccgcc ttcagtttaa actatcagtg tttgacagga tatattggcg 15000 
ggtaaaccta agagaaaaga gcgtttatta gaataatcgg atatttaaaa gggcgtgaaa 15060 
aggtttatcc gttcgtccat ttgtatgtgc atgccaacca cagggttccc cagatctggc 15120 
gccggccagc gagacgagca agattggccg ccgcccgaaa cgatccgaca gcgcgcccag 15180 
cacaggtgcg caggcaaatt gcaccaacgc atacagcgcc agcagaatgc catagtgggc 15240 
ggtgacgtcg ttcgagtgaa ccagatcgcg caggaggccc ggcagcaccg gcataatcag 15300 
gccgatgccg acagcgtcga gcgcgacagt gctcagaatt acgatcaggg gtatgttggg 15360 
tttcacgtct ggcctccgga ccagcctccg ctggtccgat tgaacgcgcg gattctttat 15420 
cactgataag ttggtggaca tattatgttt atcagtgata aagtgtcaag catgacaaag 15480 
ttgcagccga atacagtgat ccgtgccgcc ctggacctgt tgaacgaggt cggcgtagac 15540 
ggtctgacga cacgcaaact ggcggaacgg ttgggggttc agcagccggc gctttactgg 15600 
cacttcagga acaagcgggc gctgctcgac gcactggccg aagccatgct ggcggagaat 15660 
catacgcatt cggtgccgag agccgacgac gactggcgct catttctgat cgggaatgcc 15720 
cgcagcttca ggcaggcgct gctcgcctac cgcgatggcg cgcgcatcca tgccggcacg 15780 
cgaccgggcg caccgcagat ggaaacggcc gacgcgcagc ttcgcttcct ctgcgaggcg 15840 
ggtttttcgg ccggggacgc cgtcaatgcg ctgatgacaa tcagctactt cactgttggg 15900
gccgtgcttg aggagcaggc cggcgacagc gatgccggcg agcgcggcgg caccgttgaa 15960
caggctccgc tctcgccgct gttgcgggcc gcgatagacg ccttcgacga agccggtccg 16020
gacgcagcgt tcgagcaggg actcgcggtg attgtcgatg gattggcgaa aaggaggctc 16080
gttgtcagga acgttgaagg accgagaaag ggtgacgatt gatcaggacc gctgccggag 16140
cgcaacccac tcactacagc agagccatgt agacaacatc ccctccccct ttccaccgcg 16200
tcagacgccc gtagcagccc gctacgggct ttttcatgcc ctgccctagc gtccaagcct 16260
cacggccgcg ctcggcctct ctggcggcct tctggcgctc ttccgcttcc tcgctcactg 16320
actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa 16380
tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc 16440
aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc 16500
ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat 16560
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 16620
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt ttccgctgca 16680
taaccctgct tcggggtcat tatagcgatt ttttcggtat atccatcctt tttcgcacga 16740
tatacaggat tttgccaaag ggttcgtgta gactttcctt ggtgtatcca acggcgtcag 16800
ccgggcagga taggtgaagt aggcccaccc gcgagcgggt gttccttctt cactgtccct 16860
tattcgcacc tggcggtgct caacgggaat cctgctctgc gaggctggcc ggctaccgcc 16920
ggcgtaacag atgagggcaa gcggatggct gatgaaacca agccaaccag gaagggcagc 16980
ccacctatca aggtgtactg ccttccagac gaacgaagag cgattgagga aaaggcggcg 17040
gcggccggca tgagcctgtc ggcctacctg ctggccgtcg gccagggcta caaaatcacg 17100
ggcgtcgtgg actatgagca cgtccgcgag ctggcccgca tcaatggcga cctgggccgc 17160
ctgggcggcc tgctgaaact ctggctcacc gacgacccgc gcacggcgcg gttcggtgat 17220
gccacgatcc tcgccctgct ggcgaagatc gaagagaagc aggacgagct tggcaaggtc 17280
atgatgggcg tggtccgccc gagggcagag ccatgacttt tttagccgct aaaacggccg 17340
gggggtgcgc gtgattgcca agcacgtccc catgcgctcc atcaagaaga gcgacttcgc 17400
ggagctggtg aagtacatca ccgacgagca aggcaagacc gagcgccttt ccgacgctca 17460



<210> 7 

<211> 24 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide primer 

<400> 7 

gtcccaatcc accatgcaca tcag 



<210> 8 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
oligonucleotide primer 

<400> 8 

ccctcgacaa atgcaaaatg tatc 



<210> 9 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
oligonucleotide primer 

<400> 9 

gatccgctat ggatctttta tc 



<210> 10 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
oligonucleotide primer 

<400> 10 

atctaatcgt tctttctttg ac 



<210> 11 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
oligonucleotide primer 



<400> 11 

gcgccgctat ttactcga 
<210> 12
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide primer 



<400> 12 

tttctctggc gtgccgcc 



18 



